GENERAL INTRODUCTION.

In  view  of  the  fact  that,  with  the  exception  of  a very  few 
modern  text  books,  the  literature  of  Actuarial  Science  is  con- 
tained in  scattered  original  papers,  The  Actuarial  Society  of 
America  proposes  to  issue  a series  of  small  volumes  upon  im- 
portant actuarial  subjects.  Each  volume  is  intended  to  bring 
together,  as  far  as  space  permits,  the  more  important  points  of 
information  on  the  subject  discussed.  The  objects  in  issuing 
the  series  are  twofold:  (1)  to  assist  students  of  Actuarial  Science, 
and  (2)  to  furnish  a means  of  ready  reference  for  Actuaries.  The 
various  subjects  are  allocated  to  Fellows  of  the  Society  by  the 
Committee  in  Charge;  and,  associated  with  the  principal  con- 
tributor, who  is  primarily  responsible  for  the  matter  included 
and  the  views  expressed,  are  one  or  more  “Associate  Contrib- 
utors.” These  are  appointed  for  the  purpose  of  aiding  and 
criticizing  the  work  before  publication.  It  is  proposed  to  avoid 
discussing  subjects  already  covered  in  the  Text  Book  of  the 
Institute  of  Actuaries  except  as  continuity  of  thought  may  make 
occasional  references  necessary.  The  title  chosen  to  represent 
the  character  of  this  series  is  “Actuarial  Studies.” 

The  thanks  of  the  Society  and  of  the  Committee  in  Charge 
are  due  to  all  the  contributors  who  have  freely  given  of  their 
time  and  labor,  with  the  sole  purpose  of  helping  others — especially 
students. 
PREFACE.

The  following  study  is  intended  for  the  assistance  of  students 
preparing  for  the  examinations  of  the  Actuarial  Society.  The 
principal  contributor  desires  to  acknowledge  his  indebtedness  to 
the  Associate  Contributor,  and  also  to  Messrs.  R.  D.  Murphy 
and  H.  H.  Wolfenden  for  a critical  reading  of  the  manuscript, 
and  valuable  suggestions  with  regard  to  arrangement  and 
wording. It is perhaps appropriate to add here that the references
in the text to Sir George F. Hardy’s lectures on the “Construction
of Tables of Mortality, etc.” might have been much
more  numerous  than  they  appear,  more  particularly  in  that  part 
of  the  study  relating  to  “Graduation  by  Mathematical  Formula.” 

R.  H. 


Dec.  6, 1918. 


GRADUATION 

OF  MORTALITY  AND  OTHER  TABLES. 

Introduction. 

1.  There  are  two  general  classes  of  statistical  tables  of  which, 
by  the  nature  of  things,  one  only  can  be  the  subject  of  graduation. 
In  tables  of  one  kind,  the  groups,  regarding  which  information  is 
given,  are  related  to  one  another  merely  as  parts  of  some  larger 
group.  Examples  of  statistical  tables  of  this  kind  are  very  fre- 
quently met.  The  annual  reports  of  companies  of  various  kinds 
are  each  a set  of  statistical  tables  analyzing  the  income  and  dis- 
bursements, assets  and  liabilities  and  other  particulars  of  the 
companies’  business  into  component  parts.  The  abstract  tables 
which  are  published  showing  the  corresponding  items  for  the 
various  companies  furnish  another  example  of  this  class  of  statis- 
tics. Statistics  of  this  kind  are  evidently  not  subject  to  adjust- 
ment of  the  kind  ordinarily  implied  by  the  term  graduation. 
All  that  can  be  done  to  increase  the  weight  of  the  statistics  is  to 
combine  them  with  other  similar  statistics  which  may  be  homo- 
geneous with  them.  For  example,  a statistical  table  showing  the 
proportionate  distribution  of  the  assets  of  all  the  companies  of  a 
given  class  combined  would  be  less  liable  to  fluctuation  from  year 
to  year  than  a similar  table  for  the  assets  of  one  company  only. 
The  various  groups  appearing  in  statistics  of  this  kind  have  no 
serial  relation  with  one  another,  and  may,  therefore,  be  described 
as non-serial statistics, or statistics of attributes.

2.  Tables  covering  data  regarding  various  municipalities  in- 
cluded in  a given  territory  are  in  their  usual  form  examples  of 
non-serial  statistics.  If,  however,  the  various  municipalities  are 
combined  into  groups  according  to  population  or  according  to 
their  place  in  any  other  quantitative  series,  the  resulting  tables 
for  these  groups  will  then  belong  to  the  other  class  of  statistics, 
namely, serial statistics or statistics of variables, as they will form
in  this  case  a connected  series  in  which  each  group  bears  a special 
relation  to  the  groups  immediately  preceding  or  following  it. 
Statistics of various kinds for successive calendar years belong
to  the  class  of  serial  statistics.  In  mortality  statistics  in  particular 
an  analysis  of  death  claims  by  cause  of  death  would  be  an  example 
of  non-serial  statistics,  whereas  an  analysis  of  the  same  claims  by 
attained  age  or  by  duration  of  insurance  would  be  an  example  of 
serial  statistics.  Tables  of  the  kind  just  described  showing  the 
relative  frequency  with  which  different  values  of  the  characteristic 
used  as  the  basis  of  analysis  appear  in  a given  group  of  individuals 
are  called  frequency  distributions. 

If  the  events  occur  only  at  isolated  values  of  the  variable 
they  may  be  represented  graphically  by  using  the  variable  as 
abscissa  and  erecting  ordinates  proportionate  to  the  frequencies 
at  those  points.  Such  a diagram  would  be  used,  for  example,  to 
show  how  many  heads  appear  in  a fixed  number  of  tosses  of  a 
coin  in  a series  of  such  experiments.  It  is  customary  to  render 
the  outline  of  the  diagram  more  readily  visible  by  joining  the  ends 
of  the  ordinates  by  straight  lines.  If,  however,  we  are  tabulating 
the  frequency  of  a continuously  varying  quantity,  we  obtain  the 
number  of  occurrences  corresponding  to  an  interval  and  not  to 
isolated  points.  This  may  be  represented  graphically  by  erecting 
a series  of  rectangles,  each  with  its  respective  interval  as  a base 
and  enclosing  an  area  proportionate  to  the  frequency  for  that 
interval.  It  is  usual  to  assume  that  if  the  number  of  cases  ob- 
served were  indefinitely  increased  the  frequencies  of  the  successive 
values  would  form  a continuous  series;  and  the  continuous  curve 
drawn  through  the  ends  of  the  ordinates,  or  such  that  the  ordinate 
for any value of the characteristic is proportionate to the probability
of that value, is known as a frequency curve. Tables of
the exposed to risk and of the deaths in a mortality experience
are examples of frequency distributions.
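
The two diagrams just described may be illustrated by a short Python sketch. It is an editorial addition: the function names, the assumption of a fair coin, and the sample counts are all hypothetical. For isolated values it gives the ordinates directly; for a continuous quantity it gives the height of each rectangle as the frequency divided by the width of the interval, so that the area is proportionate to the frequency.

```python
from math import comb

def coin_ordinates(tosses):
    """Relative frequency of 0..tosses heads, assuming a fair coin."""
    total = 2 ** tosses
    return [comb(tosses, heads) / total for heads in range(tosses + 1)]

def rectangle_heights(edges, counts):
    """Heights of rectangles whose areas equal the interval frequencies."""
    if len(edges) != len(counts) + 1:
        raise ValueError("need one more edge than counts")
    return [c / (hi - lo) for c, lo, hi in zip(counts, edges, edges[1:])]

# Isolated values: ordinates for the number of heads in 4 tosses.
ordinates = coin_ordinates(4)

# Continuous variable: hypothetical counts in intervals of unequal width.
edges = [0, 10, 20, 40]
counts = [30, 50, 40]
heights = rectangle_heights(edges, counts)

for (lo, hi), h in zip(zip(edges, edges[1:]), heights):
    print(f"{lo:>3}-{hi:<3} height {h:.1f}")
```

The last interval holds 40 cases but is twice as wide as the others, so its rectangle is lower than that of the first interval, which holds only 30.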

In  the  examples  given  above,  the  range  of  the  possible  values 
of  the  characteristic  is  limited  in  both  directions,  but  in  certain 
cases  the  range  of  values  may  for  practical  purposes  be  taken  as 
unlimited.  This  is  especially  the  case  in  connection  with  the 
statistics  of  any  business  enterprise  or  of  a state  or  country  for  the 
successive  calendar  years,  with  the  additional  special  feature  that 
at any particular time the complete range of the table cannot, by
the nature of things, be known.

3.  Many  cases  arise  in  connection  with  the  tabulation  of  statis- 
tical data  where  the  important  element  is  not  the  absolute  numbers 
in the various groups, but rather the proportion which those
numbers in one series bear to the corresponding numbers in a
collateral  series.  In  the  case  of  a mortality  experience,  for 
example,  where  the  exposed  to  risk  and  the  deaths  are  tabulated 
it  is  ordinarily  as  a means  of  determining  the  ratio,  for  each  age  or 
group  of  ages,  of  the  deaths  to  the  exposed,  or  in  other  words  the 
rate  of  mortality.  In  the  case  of  statistics  of  this  kind  it  is  also 
natural  to  assume  that  the  values  of  these  ratios  would,  if  the 
experience  were  large  enough,  form  a regular  series,  and  this 
assumption  agrees  with  the  results  of  actual  experience,  which 
show  that  the  ratios  arising  from  a large  group  of  observed  facts 
generally  approximate  more  closely  to  a regular  series  than  where 
the  group  is  more  limited.  Tables  of  this  kind  may  be  called  tables 
of  ratios  to  distinguish  them  from  simple  frequency  distributions. 

Reasons  for  Graduation. 

4. We have stated above with reference both to frequency distributions
and to tables of ratios that it is a natural assumption
that  if  the  number  of  cases  observed  were  indefinitely  increased 
the  successive  terms  of  the  resulting  table  would  exhibit  a regular 
progression.  This  is  a particular  case  of  a general  principle  which 
is  ordinarily  expressed  by  the  Latin  proverb  “natura  non  agit  per 
saltum.”  This  assumption  is  not  in  any  way  inconsistent  with 
the  fact  that  the  results  of  any  limited  experience  will  exhibit  a 
very  irregular  progression  due  to  the  fluctuations  arising  from 
the  small  numbers  involved.  As  we  desire  in  most  cases  to  secure 
some  guide  to  the  probable  future  experience  in  similar  cases  we 
must  make  as  near  an  approach  as  we  can  to  a table  showing  the 
results  of  an  unlimited  experience.  It  is  necessary,  therefore,  to 
substitute  a regular  series  for  the  irregular  one  actually  arising. 
This  operation  is  called  graduating  the  series.  (1)  (5)* 

* The numbers appearing in brackets at the ends of this and other paragraphs
refer to the publications listed in the Bibliography at the end of the study.

5. Take, for example, a frequency curve under which the
probabilities corresponding to the successive intervals are $p_1$, $p_2$,
$p_3$ and so on, and suppose the total number of cases observed is n.
The expected number of cases in the successive intervals will then
be $np_1$, $np_2$, $np_3$ and so on. Consider then any particular interval
for which we will suppose that the corresponding probability is p
and consequently the expected number of cases np. The theory of
probability shows that the chance of the actual number being
exactly np is usually small, the mean square of the difference
between the actual number and the expected being $np(1 - p)$.
The probability of any particular value x of the actual number
of cases is

$$\frac{n!}{x!\,(n-x)!}\; p^{x} (1-p)^{n-x},$$

and an analysis of the respective probabilities of the various
degrees of departure from the expected shows that there is
approximately an even chance that the departure will exceed
$\tfrac{2}{3}\sqrt{np(1-p)}$ (see note), that the probability of the
actual number exceeding the expected is approximately equal to the
probability of its being less, each being approximately one half,
and that a departure in one direction from the expected in a
particular group is as likely as not to be followed by a departure
in the opposite direction in the next succeeding group. From these
principles it follows that the actual series of numbers will be an
irregular one, but that the proportion which the irregularities
bear to the actual numbers involved will decrease as those numbers
increase. The same principles may be shown by parallel reasoning
to apply to the case of a series of ratios.
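
The statements of this article can be checked numerically. The following Python sketch is an editorial illustration only: the values n = 2000 and p = 0.1 are hypothetical, and the binomial probabilities are evaluated through logarithms (with `math.lgamma`) merely to avoid overflow. It computes the mean square departure, the chance that the departure exceeds two thirds of the square root of np(1 - p), and the chances of the actual number being above or below the expected.

```python
from math import exp, lgamma, log, sqrt

def binomial_pmf(n, p, x):
    """Probability of exactly x cases out of n, computed through logarithms."""
    log_coeff = lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
    return exp(log_coeff + x * log(p) + (n - x) * log(1 - p))

n, p = 2000, 0.1            # hypothetical values chosen for illustration
expected = n * p
pmf = [binomial_pmf(n, p, x) for x in range(n + 1)]

# Mean square of the difference between the actual and the expected number.
mean_square = sum((x - expected) ** 2 * w for x, w in enumerate(pmf))

# Chance that the departure exceeds two thirds of sqrt(np(1 - p)).
limit = (2 / 3) * sqrt(n * p * (1 - p))
chance_exceeds = sum(w for x, w in enumerate(pmf) if abs(x - expected) > limit)

chance_above = sum(w for x, w in enumerate(pmf) if x > expected)
chance_below = sum(w for x, w in enumerate(pmf) if x < expected)

print(f"mean square departure {mean_square:.3f}, np(1-p) = {n * p * (1 - p):.3f}")
print(f"chance that the departure exceeds the limit: {chance_exceeds:.3f}")
print(f"chance above {chance_above:.3f}, chance below {chance_below:.3f}")
```

Because the number of cases can take whole values only, the computed chances are near one half rather than exactly equal to it.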

6. The table headed “Illustrative Mortality Experience” illustrates
the irregularity due to small numbers. It represents the experience
of the British Offices on female lives insured on the Ordinary Life
participating plan excluding the first fifty years of assurance. The
columns of the exposed to risk and the deaths are examples of
frequency distributions, and the marked irregularity of the figures
in the column of deaths is worthy of note. This is reflected in the
irregularity of the rates of mortality and in the $d_x$ column of the
unadjusted mortality table. The values of $q_x$ do not agree exactly
with the values of $l_x$ and $d_x$, but are based directly on the
exposed to risk and deaths.

7. The effect of an increase in the number of cases observed on
the regularity of a mortality table is shown by a comparison of the
values of $q_x$ in that table with the corresponding values in the
same experience excluding only the first five years and with the
experience on male lives excluding the first five years. The table
headed “Comparative Rates of Mortality” shows this comparison for
ages 60 to 70 inclusive.
Illustrative Mortality Experience

Age  Exposed  Died  Colog p_x  Log l_x     l_x    d_x     q_x
 55        2     0      .0000   3.0000    1000      0    .000
 56        4     0      .0000   3.0000    1000      0    .000
 57       11     0      .0000   3.0000    1000      0    .000
 58       19     0      .0000   3.0000    1000      0    .000
 59       31     1      .0142   3.0000    1000     32    .032
 60       48     1      .0091   2.9858     968     20    .021
 61       58     3      .0231   2.9767     948     49    .052
 62       72     2      .0122   2.9536     899     25    .028
 63       84     0      .0000   2.9414     874      0    .000
 64      100     4      .0177   2.9414     874     35    .040
 65      106     1      .0041   2.9237     839      8    .009
 66      114     1      .0038   2.9196     831      7    .009
 67      129     3      .0102   2.9158     824     19    .023
 68      132     5      .0168   2.9056     805     31    .038
 69      136    11      .0366   2.8888     774     62    .081
 70      135     6      .0197   2.8522     712     32    .044
 71      143    12      .0381   2.8325     680     57    .084
 72      140    10      .0322   2.7944     623     45    .071
 73      144    11      .0345   2.7622     578     44    .076
 74      149     6      .0179   2.7277     534     21    .040
 75      154    16      .0476   2.7098     513     54    .104
 76      150    24      .0757   2.6622     459     73    .160
 77      139     8      .0257   2.5865     386     22    .058
 78      145    16      .0508   2.5608     364     40    .110
 79      140    13      .0423   2.5100     324     30    .093
 80      137    19      .0648   2.4677     294     41    .139
 81      136    21      .0728   2.4029     253     39    .154
 82      126    23      .0875   2.3301     214     39    .183
 83      126    26      .1004   2.2426     175     36    .206
 84      109    26      .1184   2.1422     139     33    .239
 85       91    23      .1265   2.0238     106     27    .253
 86       77    21      .1383   1.8973      79     22    .273
 87       66    16      .1206   1.7590      57     14    .242
 88       54    12      .1091   1.6384      43      9    .222
 89       49    15      .1587   1.5293      34     11    .306
 90       39     9      .1139   1.3706      23      5    .231
 91       31     7      .1112   1.2567      18      4    .226
 92       27     6      .1091   1.1455      14      3    .222
 93       22     7      .1663   1.0364      11    3.6    .318
 94       15     2      .0622   0.8701     7.4    1.0    .133
 95       12     3      .1249   0.8079     6.4    1.6    .250
 96        8     4      .3010   0.6830     4.8    2.4    .500
 97        4     1      .1249   0.3820     2.4     .6    .250
 98        3     2      .4771   0.2571     1.8    1.2    .667
 99        1     1          ∞  -0.2200      .6     .6   1.000
Comparative Rates of Mortality

                        Values of q_x
       Female lives,     Female lives,     Male lives,
       first 50 years    first 5 years     first 5 years
Age    excluded          excluded          excluded
 60        .021              .026              .030
 61        .052              .028              .033
 62        .028              .029              .037
 63        .000              .032              .037
 64        .040              .031              .038
 65        .009              .035              .049
 66        .009              .038              .047
 67        .023              .044              .045
 68        .038              .049              .060
 69        .081              .054              .061
 70        .044              .054              .064
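
The way in which the columns of the Illustrative Mortality Experience hang together may be seen from a short computation. The Python sketch below is an editorial illustration, not part of the original study; it takes the exposed to risk and the deaths at ages 59 to 62 from that table and rebuilds the remaining columns, rounding the logarithms to four places on the assumption that the table was worked in that way.

```python
from math import log10

# (age, exposed to risk, deaths) for ages 59 to 62, as given in the table
rows = [(59, 31, 1), (60, 48, 1), (61, 58, 3), (62, 72, 2)]

log_l = 3.0                        # radix of 1000 lives, so log l = 3
table = []
for age, exposed, died in rows:
    q = died / exposed                         # unadjusted rate of mortality
    colog_p = round(-log10(1 - q), 4)          # four places, as in the table
    l = round(10 ** log_l)
    table.append((age, round(q, 3), colog_p, log_l, l))
    log_l = round(log_l - colog_p, 4)          # log l at the next age

# d_x is the decrement between successive values of l_x
d = [a[4] - b[4] for a, b in zip(table, table[1:])]

for row in table:
    print(row)
print(d)
```

The value of q at age 59 is 1/31 = .032, while the ratio of d to l at that age is 32/1000; the two agree here, but at other ages the rounding of l produces the small disagreements mentioned in article 6.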

8.  It  is  a general  principle  of  science  that  in  seeking  to  explain 
an  observed  set  of  facts  that  hypothesis  is  adopted  which,  while 
being  consistent  with  the  general  results  of  collateral  observations 
and  with  the  general  principles  of  nature,  best  explains  the  ob- 
served facts,  or  in  other  words  makes  their  probability  a maximum. 
For  example,  if  we  observed  that  of  1,000  people  of  a given  age, 
eight  die  within  a year,  and  if  we  have  no  collateral  observations  at 
adjacent  ages  with  which  it  is  necessary  to  harmonize  these  results, 
we  adopt  the  hypothesis  that  the  probability  of  death  within  one 
year  at  that  age  is  8/1,000  or  1/125,  because  that  is  the  value  of  the 
probability  which  gives  the  greatest  value  to  the  chance  of  exactly 
eight  people  dying  out  of  1,000  exposed  to  risk.  Where,  however, 
we  have  a series  of  observations  at  consecutive  ages  it  is  necessary 
to  substitute  a smooth  series  for  the  irregular  one  representing 
the  ungraduated  observations.  The  substituted  series  must,  from 
the  nature  of  things,  be  the  result  of  a compromise  between  the 
two  factors  of  smoothness  and  closeness  to  observed  facts.  It  is 
theoretically  possible  to  assign  a basis  for  the  numerical  measure- 
ment of  the  irregularity  of  a series  as  well  as  for  its  departure  from 
the  observed  facts,  and  by  assigning  the  proportion  in  which  an 
increase  in  the  one  is  to  be  taken  as  counterbalancing  a decrease 
in  the  other,  to  arrive  by  a mathematical  process  at  the  series  which 
best  harmonizes  the  two  factors.  On  any  basis  suggested,  how- 
ever, the  resulting  equations  are  numerous  and  unwieldy  to  such 
an extent as to render the process practically prohibitive. Tentative
processes are therefore necessary. These methods come under
four general divisions. Under the first method a diagram is
made to represent graphically the observed facts and a continuous
curve  is  then  drawn  as  a basis  for  the  graduated  series.  Under 
the  second  method,  the  graduated  series  is  formed  by  interpolation 
on  the  basis  of  values  determined  for  fixed  intervals,  these  values 
being  so  determined  as  to  give  an  interpolated  series  fitting  as 
closely  as  possible  to  the  observed  facts.  Under  the  third  method 
the  individual  terms  of  the  graduated  series  are  each  determined 
by  a summation  of  adjacent  terms  of  the  original  series,  a correc- 
tion being  introduced  to  allow  for  the  differences  of  the  second  and 
higher  orders  in  the  series.  Under  the  fourth  method  a mathe- 
matical formula  containing  arbitrary  constants  is  used  to  express 
the  series  and  the  constants  determined  so  as  to  adhere  as  closely 
as  possible  to  the  observed  facts. 
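
The example of the eight deaths among 1,000 lives, given earlier in this article, can be verified by direct trial. The Python sketch below is an editorial addition; the grid of trial rates and the function name are hypothetical. It computes the chance of exactly eight deaths under each trial rate and picks the rate that makes that chance greatest.

```python
from math import comb

def chance_of_exactly(deaths, exposed, q):
    """Binomial chance of exactly `deaths` out of `exposed` at rate q."""
    return comb(exposed, deaths) * q ** deaths * (1 - q) ** (exposed - deaths)

exposed, deaths = 1000, 8
candidates = [k / 10000 for k in range(1, 300)]   # trial rates .0001 to .0299
best = max(candidates, key=lambda q: chance_of_exactly(deaths, exposed, q))
print(best)   # 0.008
```

The search settles on 8/1,000, the observed proportion, in agreement with the hypothesis adopted in the text.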

9.  It  is  evident  that  the  resulting  series  even  after  graduation 
by  any  of  these  methods  is  still  based  on  a limited  number  of 
observations  and  therefore  is  necessarily  affected  by  errors  arising 
from  such  limitation.  The  result  of  the  graduation,  however,  is 
to  distribute  the  errors  arising  at  any  particular  term  over  a 
considerable  range  of  adjacent  terms,  thus  largely  reducing  the 
remaining  errors  by  allowing  positive  and  negative  errors  to 
counterbalance  one  another. 

Criteria  of  a Good  Graduation. 

10.  After  a graduated  series  has  been  constructed,  it  is  usually 
tested  with  respect  to  the  two  points  of  smoothness  and  closeness 
to  the  observed  facts.  With  respect  to  smoothness  the  fact  that 
a series  is  determined  by  a mathematical  formula  is  usually  taken 
as  a sufficient  test,  but  when  it  is  not  so  determined  the  criterion 
usually  adopted  in  this  respect  is  the  smallness  of  the  third  differ- 
ences in  the  graduated  series.  This  smallness  is  sometimes  tested 
by  inspection  of  the  differences  after  they  have  been  taken  out, 
but  in  comparing  two  different  graduations  of  the  same  series, 
if  it  is  desired  to  have  a numerical  measure  of  the  smoothness  of 
each,  the  sum  of  the  squares  of  the  third  differences  in  the  different 
sections  of  the  table  may  be  taken  as  such  measure. 
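
The numerical measure of smoothness just described may be sketched as follows. This Python fragment is an editorial illustration: the function names are hypothetical, and the two series compared are not graduations but the unadjusted values of q_x (per thousand, ages 60 to 70) for female lives from the comparative table, used only because they show a rough and a comparatively regular series side by side.

```python
def differences(series, order=1):
    """Finite differences of the given order."""
    for _ in range(order):
        series = [b - a for a, b in zip(series, series[1:])]
    return series

def roughness(series):
    """Sum of the squares of the third differences."""
    return sum(d * d for d in differences(series, 3))

# Values of q_x per thousand at ages 60-70, from the comparative table.
female_50 = [21, 52, 28, 0, 40, 9, 9, 23, 38, 81, 44]   # first 50 years excluded
female_5 = [26, 28, 29, 32, 31, 35, 38, 44, 49, 54, 54]  # first 5 years excluded

print(roughness(female_50), roughness(female_5))
```

The smaller experience gives a sum of 50361 against 220 for the larger, so that the measure reflects what is evident on inspection.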

11. With respect to closeness to the observed facts, the
requirements usually made are (1) that the total number and the first
and second moments* about any assigned point shall be approximately
the same in the graduated series as in the ungraduated,
and (2) that the departures in individual groups shall not on the
average materially exceed in magnitude those expected in accordance
with the theory of probability. In the case of mortality
tables these comparisons are usually based on the expected deaths
as obtained by multiplying the number exposed to risk by the
rate of mortality at the individual ages. The comparison is
usually made by recording the difference between the actual
deaths at individual ages and the expected. A continuous summation
with due regard to sign of these deviations is then made.
The smallness of the numbers in this column of accumulated
deviations, the frequency in change of sign and the extent to
which positive and negative terms balance one another form the
tests of the closeness of agreement in the total number and the
first and second moments. This follows from the fact that the
successive moments may each be expressed in terms of summations,
so that if the summations, up to a certain order, in two series
are equal, all moments which may be expressed in terms of those
summations are also equal. The number of summations required
is one more than the highest order of moments, since the first
summation merely determines the total number. (4) (5)

* The nth moment about any assigned point is the sum of the nth powers
of the values of the variable measured from that point as origin each
multiplied by the proportionate frequency of the value. In the language
of the infinitesimal calculus the nth moment about the point x = a is

$$\int y\,(x - a)^{n}\,dx$$

when y represents the relative frequency of the value x. (See also articles 57
to 60 inclusive.)

12. From the principles of the theory of probability, it is known
that where p is the probability of an event happening and q the
probability of its not happening at a given trial, the average
deviation from the mean irrespective of sign in n trials is approximately
$\tfrac{4}{5}\sqrt{npq}$ when n is large while the mean value of the square
of the departure in the same number of trials is $npq$. The magnitude of
the individual departures can, therefore, be tested by comparing
the sum of the individual departures irrespective of sign with its
expected value, as derived from the above formula, or the comparison
may be based either on the sum of the squares of the departures
or on the sum of those squares each divided by its mean
value. The latter test is the one used by Prof. Karl Pearson to
measure the goodness of fit where a frequency curve has been
applied to the graduation of statistical tables. Its use is supported
logically by the fact that the quantity so arrived at is
approximately proportional to the logarithm of the ratio between
the probability, on the basis of the graduated table, of the observed
facts and that of the results expected according to that table.
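
The three tests of the individual departures may be sketched in Python as follows. This is an editorial illustration under stated assumptions: no graduated table is given in the text at this point, so the rates of the female experience excluding five years (ages 60 to 70) are used as a stand-in for graduated rates and applied to the exposed and deaths of the experience excluding fifty years. The factor for the average deviation is taken from the normal curve as the square root of 2/π, or about four fifths; with expected deaths as small as these the approximation is rough.

```python
from math import pi, sqrt

# Ages 60-70: exposed and deaths from the illustrative experience.
exposed = [48, 58, 72, 84, 100, 106, 114, 129, 132, 136, 135]
deaths = [1, 3, 2, 0, 4, 1, 1, 3, 5, 11, 6]
# Stand-in for graduated rates (an assumption made for this sketch only).
rates = [.026, .028, .029, .032, .031, .035, .038, .044, .049, .054, .054]

expected = [e * q for e, q in zip(exposed, rates)]
variance = [e * q * (1 - q) for e, q in zip(exposed, rates)]   # npq at each age
departure = [d - x for d, x in zip(deaths, expected)]

# Test 1: sum of departures irrespective of sign against its expected value.
sum_abs = sum(abs(v) for v in departure)
expected_sum_abs = sum(sqrt(2 / pi) * sqrt(v) for v in variance)

# Test 2: sum of squared departures against the sum of the values of npq.
sum_sq = sum(v * v for v in departure)
expected_sum_sq = sum(variance)

# Test 3: each squared departure divided by its mean value npq.
pearson = sum(v * v / var for v, var in zip(departure, variance))

print(f"absolute departures {sum_abs:.3f}, expected {expected_sum_abs:.3f}")
print(f"squared departures  {sum_sq:.3f}, expected {expected_sum_sq:.3f}")
print(f"sum of squares each divided by npq: {pearson:.3f} over {len(rates)} ages")
```

In the third test each term has a mean value of about unity, so that the total may be compared with the number of ages included.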

Graphic  Method. 

13.  The  graphic  method  of  graduating  statistical  tables  arises 
naturally  from  the  graphic  method  of  representing  the  tables. 
It  applies  to  frequency  distributions  as  well  as  to  tables  of  ratios. 
Under  this  method  the  items  of  a table  are  represented  by  points 
in  a diagram.  For  convenience  in  plotting  the  diagram  accurately 
ruled  section  paper  is  ordinarily  used,  the  number  of  squares 
measured  along  a selected  base  line  being  taken  to  represent  the 
argument  of  the  table  and  the  value  of  the  function  being  repre- 
sented on  a suitable  scale  by  the  distance  of  the  point  from  the 
base.  When  the  points  corresponding  to  the  successive  values  of 
the  argument  are  plotted  and  joined  by  straight  lines  it  is  found 
that  the  result  is  a zigzag  line  full  of  minor  irregularities,  but 
showing  indications,  the  strength  of  which  depends  on  the  volume 
of  the  observations,  of  an  underlying  regular  law.  The  graduation 
of  the  series  is  effected  by  drawing  among  these  points,  but  not 
necessarily  through  any  of  them,  a regular  curve  representing  this 
law.  Preliminary  groupings,  not  necessarily  covering  equal  inter- 
vals but  so  arranged  as  to  produce  the  greatest  attainable  regu- 
larity, may  be  made  in  order  to  bring  out  this  law.  In  some  cases 
where  the  observations  are  few  two  different  groupings  of  the 
same  material  may  indicate  different  curves.  In  that  case  col- 
lateral information  should  be  used  to  determine  which  to  follow. 
After  the  curve  is  drawn  the  values  of  the  ordinates  are  read  off 
and  the  results  corrected  to  remove  any  irregularities  due  to  errors 
in  reading.  A comparison  is  then  made  between  the  graduated 
series  and  the  original  data  and  the  series  is  amended,  if  necessary, 
to  bring  the  variation  between  the  two  within  the  limits  considered 
permissible. 

14.  In  the  case  of  a mortality  table  this  comparison  is  ordinarily 
made  by  computing  the  expected  deaths  by  the  graduated  table 
and recording the difference, with due regard to sign, between the
actual  deaths  and  the  expected.  The  deviations  for  the  successive 
ages  are  then  summed  continuously,  forming  a column  of  accumu- 
lated deviations.  If  a relatively  large  and  persistent  deviation  in 
either  direction  is  accumulated  in  any  section  of  the  table,  the 
series  is  amended  to  reduce  or  eliminate  it. 
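
The column of accumulated deviations may be formed as in the following Python sketch. It is an editorial illustration: the exposed and deaths are those of the illustrative experience at ages 60 to 70, and, in the absence of a graduated table in the text, the rates of the female experience excluding five years are assumed as a stand-in for the graduated rates.

```python
from itertools import accumulate

ages = list(range(60, 71))
exposed = [48, 58, 72, 84, 100, 106, 114, 129, 132, 136, 135]
deaths = [1, 3, 2, 0, 4, 1, 1, 3, 5, 11, 6]
# Stand-in for graduated rates (an assumption made for this sketch only).
rates = [.026, .028, .029, .032, .031, .035, .038, .044, .049, .054, .054]

expected = [e * q for e, q in zip(exposed, rates)]
deviation = [d - x for d, x in zip(deaths, expected)]   # actual less expected
accumulated = list(accumulate(deviation))               # continuous summation

# Number of times the accumulated deviation changes sign.
sign_changes = sum(1 for a, b in zip(accumulated, accumulated[1:]) if a * b < 0)

for age, dev, acc in zip(ages, deviation, accumulated):
    print(f"{age}  {dev:+7.3f}  {acc:+8.3f}")
print("changes of sign:", sign_changes)
```

Here the accumulated deviation becomes persistently negative after age 62, which is the kind of indication that, in an actual graduation, would lead to the series being amended.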

15.  The  following  extracts  from  Dr.  Sprague’s  description  of 
the  application  of  this  method  will  indicate  the  considerations 
which  arise.  (10) 

“.  . . In  order  to  understand  and  successfully  apply  the  graphic  method 
of  graduation,  it  is  necessary  to  study  carefully  the  relations  that  exist  between 
the  progression  of  the  numbers  and  their  differences,  and  the  form  of  the  curve. 
If  we  have  any  smooth  curve,  and  take  the  values  of  the  ordinates  at  equal 
small  intervals  along  the  base  line,  and  then  find  the  first  and  second  differences 
of  these  values,  we  may  easily  satisfy  ourselves  that,  if  the  first  differences 
are  all  positive,  the  curve  continually  recedes  from  the  base  line:  and  if  they 
are  negative,  the  curve  approaches  that  line;  also  that  if  the  second  differences 
are  positive,  the  curve  has  its  convexity  turned  to  the  base  line;  and  if  they 
are  all  negative,  it  is  concave  to  that  line. 

First Differences     Positive    Positive    Negative    Negative
Second Differences    Positive    Negative    Positive    Negative
Form of the curve     [woodcut showing the four forms of curve]

There are thus four cases, which are represented in the above woodcut. As
a specimen of the first, we may take the series formed by the squares or cubes
of the natural numbers; for the second, we may take the series formed by
the square or cube roots of those numbers; for the third, the series formed by
the reciprocals, either of the numbers or of their squares or cubes; and those
students who are not already familiar with the propositions I have stated,
cannot do better than verify them by taking numerical examples. It will
be found an instructive exercise to extract from Barlow’s very useful tables,
numbers of the kind I have described, then to calculate their first and second
differences, and lastly to plot down on cross-ruled paper the curve that
corresponds to each different series of numbers. As a specimen of the fourth
form of curve, we may take the series of numbers obtained by subtracting the
squares of the natural numbers from a fixed number; for instance, by
subtracting the squares of 1, 2, 3, . . . , 9 from 100, we get the series of numbers
99, 96, 91, 84, 75, 64, 51, 36, 19: the first differences of these are, -3, -5,
-7, -9, -11, -13, -15, -17; and the second differences are all equal
to -2.”
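
The exercise which Dr. Sprague recommends may also be carried out by machine. The following Python sketch is an editorial illustration, with hypothetical function names; it forms the four specimen series for the numbers 1 to 9 and reports the signs of their first and second differences.

```python
def differences(series):
    """First differences of a series."""
    return [b - a for a, b in zip(series, series[1:])]

def sign_pattern(series):
    """Signs of the first and second differences, if each is of one sign."""
    def sign(values):
        if all(v > 0 for v in values):
            return "Positive"
        if all(v < 0 for v in values):
            return "Negative"
        return "Mixed"
    first = differences(series)
    return sign(first), sign(differences(first))

numbers = range(1, 10)
cases = {
    "squares": [n ** 2 for n in numbers],
    "square roots": [n ** 0.5 for n in numbers],
    "reciprocals": [1 / n for n in numbers],
    "100 less squares": [100 - n ** 2 for n in numbers],
}
for name, series in cases.items():
    print(f"{name:<18}", sign_pattern(series))
```

The four series give in turn the four combinations of signs, in the order in which the specimens are named in the extract.
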
“If we find that the series of first differences changes from positive to
negative,  then  the  curve,  which  was  receding  from  the  base  line,  has  a maxi- 
mum point,  and  begins  to  approach  that  line;  and,  on  the  contrary,  if  the 
change  is  from  negative  to  positive,  the  curve,  which  was  approaching  the 
base  line  has  a minimum  point,  and  begins  to  recede  from  that  line.  These 
two  forms  of  the  curve  are  shown  in  the  appended  woodcut,  A and  B being 
the  maximum  and  minimum  points  respectively.  The  student  will  find  several 
examples  of  this  kind  in  the  two  series  of  graduated  probabilities  given  by 
Mr.  Higham  on  pages  20  and  21  of  his  paper.” 


“If  the  second  differences  change  sign,  the  curve  has  what  is  called  a point 
of  inflection,  or  a point  of  contrary  flexure;  that  is  to  say,  up  to  a certain  point 
it  is  convex  to  the  base  line,  and  at  that  point  it  changes  its  direction  and 
becomes  concave,  or  the  contrary.  If  the  second  differences  change  sign  from 
positive  to  negative,  the  curve  is  first  convex  to  the  base  line  and  afterwards 
concave,  as  is  shown  in  Figures  (1)  and  (2)  of  the  following  woodcut;  while 
if  the  differences  change  from  negative  to  positive,  the  curve  passes  from 
concave  to  convex,  as  shown  in  figures  (3)  and  (4)  of  the  woodcut.  Many 
examples  of  this  kind  occur  in  Mr.  Higham’s  second  series  of  adjusted 
probabilities.” 


“The  student  will  easily  be  able  to  satisfy  himself  that  there  must  always 
be  a point  of  inflection  between  a maximum  and  a minimum  point  in  the 
curve.  In  the  neighborhood  of  a maximum  point  the  curve  is  concave  to 
the  base  line,  and  the  second  differences  are  negative;  while  in  the  neighbor- 
hood of  a minimum  point  the  curve  is  convex  to  the  base  line,  and  the  second 
differences  are  positive.  Hence,  in  passing  from  a maximum  point  to  a 
minimum,  the  second  differences  change  sign  from  negative  to  positive,  which 
proves  that  there  is  a point  of  inflection  between  them.  Thus,  in  the  diagram 
above,  there  is  a point  of  inflection,  C,  between  the  maximum  point,  A, 
and  the  minimum  point,  B.” 

“When,  . . . the  intervals  at  which  the  ordinates  are  taken  are  not  all 
equal,  all  that  has  been  said  regarding  first  differences  still  holds  good,  but 
the statement requires modification as regards the second differences. When
the interval is constant, the first difference forms a measure of the rate at
which the curve recedes from the base line (or approaches it); but when the
intervals are of unequal magnitude, it is clear that, in order to get a proper
measure of the rate at which the curve recedes from (or approaches to) the
base  line,  we  must  divide  each  first  difference  by  the  interval  to  which  it 
relates.  . . . These  numbers,  then,  which  I call  divided  first  differences, 
measure  the  rate  at  which  the  curve  recedes  from  (or  approaches  to)  the  base 
line.  So  long  as  this  rate  increases,  the  curve  is  convex  to  the  base  line;  and 
when  this  rate  diminishes,  the  curve  is  concave  to  the  base  line.  We  have 
therefore  to  take  the  differences  of  the  divided  first  differences,  . . . and 
when  these  are  positive,  the  curve  is  convex  to  the  base  line;  and  when  they 
are  negative,  the  curve  is  concave.  The  number  of  changes  of  sign,  therefore, 
indicates  the  number  of  points  of  inflection  in  the  curve.  . . .” 
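
The rule quoted above for unequal intervals can be tried mechanically. The following is a minimal sketch only, not part of the quoted paper: the function names and the sample ordinates are hypothetical, and exact zeros in the final column are simply skipped when counting changes of sign.

```python
def divided_first_differences(xs, ys):
    """First differences of ys, each divided by the interval it relates to."""
    return [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]


def sign_changes(values):
    """Number of changes of sign in a series, skipping exact zeros."""
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def inflection_count(xs, ys):
    """Changes of sign in the differences of the divided first differences."""
    rates = divided_first_differences(xs, ys)
    second = [b - a for a, b in zip(rates, rates[1:])]
    return sign_changes(second)


# Hypothetical ordinates taken at unequal intervals.
xs = [0, 1, 3, 4, 6, 7]
cubic = [(x - 3) ** 3 for x in xs]   # passes from concave to convex at x = 3
square = [x * x for x in xs]         # convex throughout
print(inflection_count(xs, cubic), inflection_count(xs, square))
```

A count of zero corresponds to the condition laid down for the adjusted curve, namely that the final column shall exhibit no changes of sign.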

“Pursuing  the  same  course  with  the  figures  . . . which  relate  to  ages 
over  70,  we  see  that  the  final  column  exhibits  15  changes  of  sign,  corresponding 
to  an  equal  number  of  points  of  inflection  in  the  curve;  and  ...  we  have  now 
to  substitute  for  this  irregular  curve  one  which  shall  present  no  points  of 
inflection,  so  that  when  we  take  the  corresponding  probabilities  of  dying  at 
each  age  and  form  the  second  differences,  these  shall  exhibit  no  changes  of 
sign.  It  is  here  that  the  graphic  method  comes  in.  We  have  ...  22 
points  marked  on  our  sheet  of  cross-ruled  paper;  and  we  must  actually  draw 
the  best  curve  we  can,  that  shall  pass  as  near  those  points  as  practicable, 
above  some  and  below  others,  but  without  exhibiting  any  point  of  inflection.” 

“I  may  here  remark  that,  when  our  points  indicate  a curve  free  from  sudden 
changes  of  direction,  and  with  its  curvature  in  the  same  direction  throughout, 
theoretical  considerations  teach  us  that  the  curve  should  not  pass  exactly 
through  the  points,  but  a little  outside  them.  It  does  not  seem  desirable 
to  give  here  a formal  proof  of  this  proposition;  but,  for  the  benefit  of  those 
who  may  be  disposed  to  investigate  the  matter  for  themselves,  it  may  suffice 
to  state  that,  having  regard  to  the  way  in  which  our  points  are  obtained,  if 
we  assume  that  the  rate  of  mortality  increases  by  constant  first  differences 
through  each  of  the  intervals  we  have  dealt  with,  the  unadjusted  mortality 
curve  will  form  a polygonal  figure  ABC  . . . the  sides  of  which  will  be 
bisected  by  our  points,  P,  Q,  . . .;  and  our  adjusted  curve  must  be  drawn 
so  that  its  area  is  approximately  the  same  as  that  of  the  polygon,  as  shown 
in  the  subjoined  figure.  The  distance,  however,  between  the  curve  and  the 
points  is  so  small,  that  it  may  be  generally  disregarded,  and  the  curve  may  be 
drawn  through  the  points.” 


“.  . . The  next  step  is  to  note  the  points  in  which  it  cuts  the  vertical 
lines corresponding to the various ages, and to estimate the length of the
ordinate for each age. . . . There is always a liability to error in estimating
the  tenths  of  an  interval  and  small  errors  may  also  arise  from  inequalities  in 
the  ruling  of  the  paper,  or  from  the  curve  being  drawn  unsteadily.  In  order 
to  remove  these  errors  I difference  the  quantities,  and,  when  I find  the  series 
of  differences  presents  irregularities,  I remove  these  by  inspection.” 

In applying the graphic method to select or analyzed data Dr.
Sprague first formed a short mortality table covering five years
from entry for each age at entry. He then averaged the values
of $l_{[x]+t}$ and $d_{[x]+t}$ for quinquennial intervals of $[x]$, thus forming
values of $q_{[x]+t}$ for quinquennial values of $[x]$. After preliminary
elimination of great irregularities from each series he then graduated
graphically the values of

\[ q_{[x]}, \qquad \frac{q_{[x]+1}}{q_{[x]}} \qquad \text{and} \qquad \frac{q_{[x]+2}}{q_{[x]+1}}. \]

The values of $q_{[x]+3}$ and $q_{[x]+4}$ were then filled in by interpolation
between the values of $q_{[x]}$, $q_{[x]+1}$ and $q_{[x]+2}$ so determined and the
values of $q_{x+5}$, etc., in the ultimate table (see J. I. A., Vol. XXI,
p. 229).

16.  In  the  construction  of  the  Carlisle  Table,  Milne  used  a 
graphic  process  to  redistribute  into  the  individual  ages  the  popula- 
tion and  deaths  upon  which  the  table  was  based.  They  were 
originally  arranged  in  groups  by  quinquennial  intervals  up  to 
age  20  and  thereafter  by  decennial  intervals.  A separate  diagram 
was  made  for  each  by  constructing  a series  of  rectangles  such  that 
the  base  of  each  represented  an  age  interval  and  the  area  the 
number  in  the  corresponding  group.  A curve  was  then  drawn 
through  the  tops  of  these  rectangles  which  was  made  as  smooth 
as  possible  consistent  with  the  requirement,  not  observed  in  the 
ordinary  graphic  method,  that  the  area  between  the  curve  and 
the  base  should  be  the  same  for  each  interval  as  that  of  the  corre- 
sponding rectangle.  The  area  under  the  curve  within  the  interval 
corresponding  to  each  individual  year  of  life,  then  gave  the  re- 
distributed number  of  the  population  or  deaths  corresponding  to 
that  year.  From  the  figures  so  obtained  the  central  death  rates, 
and  from  them  the  rates  of  mortality  were  constructed  in  the 
usual  way.  (8) 

17. In applying the graphic method to mortality tables and to
some others the difficulty is found that if the scale of the diagram
is sufficiently large to permit of accurate reading in one part of
the curve it is so large that in another part the curve crosses the
lines representing the ordinates at a very acute angle, thus not
only increasing the difficulty of reading but multiplying greatly
the effect on the resulting values of a slight deviation in the
curve. This difficulty is sometimes met by using different scales
in the different sections of the table, but where a mathematical
formula can be determined which approximately represents the
series it may be used as a basis, either by representing graphically
not the term of the series itself, but its ratio to the corresponding
value of the mathematical function or by the process explained
below.

18. In the case of frequency distributions the mathematical
function most generally useful will be that representing the
exponential law of error, viz., $\frac{1}{\sqrt{\pi}}\,e^{-x^2}$. Let then

\[ f(x) = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{x} e^{-x^2}\,dx \]

and

\[ F(x) = \frac{\int_{-\infty}^{x} y\,dx}{\int_{-\infty}^{\infty} y\,dx}, \]

where $y$ is the ungraduated value of the function corresponding
to the value $x$ of the argument. In this connection it is to be
noted that tables of $f(x) - \tfrac{1}{2}$ or of $2f(x) - 1$ are to be found in
various treatises on probability and that the determination of the
values of $F(x)$ will ordinarily reduce to summation as our knowledge
of the value of $y$ is usually in the form of a statement of the
successive values of $\int y\,dx$. For each value of $x$ a value of $z$
is then determined by the equation $f(z) = F(x)$ and the successive
differences of $z$ are then graduated in the manner already described.
From these graduated differences we proceed successively to
graduated values of $z$, $f(z)$ or $F(x)$ and $\Delta F(x)$ which is the graduated
series required. Where tables of $f(x)$ are not available the function

\[ \frac{e^{x}}{1 + e^{x}} \]

may be used instead, in which case we have the relation

\[ z = \log \frac{Y}{N - Y}, \]

where $N$ is the total number of cases and $Y$ the
number of cases for values of the argument less than $x$. (9) (11)
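
The alternative relation $z = \log\{Y/(N - Y)\}$ lends itself to a short numerical illustration. The sketch below is offered under stated assumptions: the frequency counts are hypothetical (they are not taken from any table in the text), natural logarithms are assumed, and the function names are illustrative. It forms $z$ at each interior point of division, takes the differences which would then be graduated, and passes back from $z$ to the cumulative numbers.

```python
import math


def to_z(cumulative, total):
    """z = log(Y / (N - Y)), for cumulative counts Y with 0 < Y < N."""
    return [math.log(y / (total - y)) for y in cumulative]


def from_z(z, total):
    """Invert the relation: Y = N e^z / (1 + e^z)."""
    return total * math.exp(z) / (1.0 + math.exp(z))


# Hypothetical frequency distribution (not data from the text).
counts = [2, 7, 18, 30, 25, 12, 6]
total = sum(counts)
cumulative = [sum(counts[:i]) for i in range(1, len(counts))]  # Y at each interior boundary
z = to_z(cumulative, total)
dz = [b - a for a, b in zip(z, z[1:])]      # the differences that would be graduated
restored = [from_z(v, total) for v in z]    # back from z to cumulative counts
print([round(v, 3) for v in dz])
```

In an actual graduation the differences `dz` would be smoothed before the return passage, and the graduated series would be the first differences of the restored cumulative values.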
19. In the case of mortality tables we may put

\[ \log l_x = K - Ax - e^{z} \]

so that we have $z = \log (K - Ax - \log l_x)$ where $K$ and $A$ are
so selected as to make the series of values of $z$ in a general way
arithmetic. This may be done either by trial or from the
equations

\[
\frac{\log l_x + Ax - K}{\log l_{x+t} + A(x+t) - K}
= \frac{\log l_{x+t} + A(x+t) - K}{\log l_{x+2t} + A(x+2t) - K}
= \frac{\log l_{x+2t} + A(x+2t) - K}{\log l_{x+3t} + A(x+3t) - K}
\]
\[
= \frac{\log l_x - \log l_{x+t} - At}{\log l_{x+t} - \log l_{x+2t} - At}
= \frac{\log l_{x+t} - \log l_{x+2t} - At}{\log l_{x+2t} - \log l_{x+3t} - At}
= \frac{\log l_x - 2\log l_{x+t} + \log l_{x+2t}}{\log l_{x+t} - 2\log l_{x+2t} + \log l_{x+3t}},
\]

where $x$ and $t$ are given suitable values. The differences of $z$
are again graduated as before and from these graduated differences
are constructed graduated values of $z$, $\log l_x$, $\log p_x$ and $q_x$.
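
The chain of equal ratios may be solved directly for $A$ and $K$ from four equidistant values of $\log l_x$. The following is a sketch under assumptions, not the author's working: the last member of the chain gives the common ratio, the first-difference ratios then give $A$, and the first ratio gives $K$. The test series is synthetic, built so that $e^{z}$ is exactly geometric, and all names and constants are hypothetical.

```python
import math


def fit_A_K(x, t, L0, L1, L2, L3):
    """Solve the equal-ratio equations; L0..L3 are log l at x, x+t, x+2t, x+3t."""
    r = (L0 - 2 * L1 + L2) / (L1 - 2 * L2 + L3)        # common ratio, from second differences
    A = (r * (L1 - L2) - (L0 - L1)) / ((r - 1) * t)    # from the first-difference ratios
    K = (r * (L1 + A * (x + t)) - (L0 + A * x)) / (r - 1)  # from the first ratio
    return A, K


# Synthetic series: log l_x = K - A x - B c^x, so that z is exactly arithmetic.
K_TRUE, A_TRUE, B, C = 5.0, 0.003, 1e-4, 1.1


def log_l(age):
    return K_TRUE - A_TRUE * age - B * C ** age


x, t = 60, 10
A, K = fit_A_K(x, t, *(log_l(x + i * t) for i in range(4)))
z = [math.log(K - A * age - log_l(age)) for age in range(60, 91, 10)]
dz = [b - a for a, b in zip(z, z[1:])]
print(round(A, 6), round(K, 6), [round(d, 6) for d in dz])
```

With observed data the ratios will not agree exactly, so the values of $A$ and $K$ found in this way serve only to make $z$ arithmetic "in a general way", as the text puts it.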


Comparison of Actual and Expected Claims

Age.   Expected   Actual        Age.   Expected   Actual
       Deaths.    Deaths.              Deaths.    Deaths.
 55       .0         0           80      19.0       19
 56       .1         0           81      20.4       21
 57       .3         0           82      20.5       23
 58       .5         0           83      21.4       26
 59       .8         1           84      20.7       26
 60      1.4         1           85      18.7       23
 61      1.8         3           86      17.1       21
 62      2.4         2           87      15.8       16
 63      3.1         0           88      14.0       12
 64      3.9         4           89      13.7       15
 65      4.5         1           90      11.7        9
 66      5.2         1           91      10.0        7
 67      6.3         3           92       9.4        6
 68      7.0         5           93       8.2        7
 69      7.8        11           94       6.0        2
 70      8.4         6           95       5.1        3
 71      9.6        12           96       3.7        4
 72     10.2        10           97       1.9        1
 73     11.4        11           98       1.5        2
 74     12.7         6           99        .5        1
 75     14.3        16
 76     15.1        24
 77     15.1         8
 78     17.1        16
 79     17.9        13
20. In graduating the mortality experience given on page 5
the graduated $O^{M(5)}$ table which is constructed by a mathematical
formula may be adopted as a basis. The table on page 15
shows the actual and expected deaths, the latter being calculated
by multiplying the number exposed by the rate of mortality
according to the $O^{M(5)}$ table.

The  data  may  then  be  grouped  in  order  to  reduce  the  irregu- 
larities as  follows: 


Ages.    Average Age.   Expected.   Actual.   Percentage.
55-63        61.2          10.4         7         67.3
64-69        66.9          34.7        25         72.0
70-74        72.2          52.3        45         86.0
75-79        77.1          79.5        77         97.0
80-84        82.0         102.0       115        112.7
85-92        88.0         110.4       109         98.7
93-99        94.7          26.9        20         74.3

In  calculating  the  average  age  for  each  group,  each  age  is 
weighted  in  proportion  to  the  expected  deaths. 
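
The grouping may be reproduced as follows. This is an illustrative sketch only: the expected deaths are entered in tenths, as transcribed from the comparison of actual and expected claims, and the function name is hypothetical. The average age is weighted by the expected deaths and the percentage is the ratio of actual to expected deaths in the group.

```python
# Expected deaths by the basis table (in tenths) and actual deaths, ages 55 to 99,
# transcribed from the comparison of actual and expected claims in the text.
AGES = list(range(55, 100))
EXPECTED_TENTHS = [0, 1, 3, 5, 8, 14, 18, 24, 31, 39, 45, 52, 63, 70, 78,
                   84, 96, 102, 114, 127, 143, 151, 151, 171, 179,
                   190, 204, 205, 214, 207, 187, 171, 158, 140, 137,
                   117, 100, 94, 82, 60, 51, 37, 19, 15, 5]
ACTUAL = [0, 0, 0, 0, 1, 1, 3, 2, 0, 4, 1, 1, 3, 5, 11,
          6, 12, 10, 11, 6, 16, 24, 8, 16, 13,
          19, 21, 23, 26, 26, 23, 21, 16, 12, 15,
          9, 7, 6, 7, 2, 3, 4, 1, 2, 1]


def group_summary(first_age, last_age):
    """Return (average age, expected deaths, actual deaths, percentage) for a group."""
    idx = [age - 55 for age in range(first_age, last_age + 1)]
    expected = sum(EXPECTED_TENTHS[i] for i in idx)                 # in tenths
    actual = sum(ACTUAL[i] for i in idx)
    average_age = sum(AGES[i] * EXPECTED_TENTHS[i] for i in idx) / expected
    percentage = 1000.0 * actual / expected                         # actual as % of expected
    return average_age, expected / 10.0, actual, percentage


for first, last in [(55, 63), (64, 69), (80, 84)]:
    avg, exp, act, pct = group_summary(first, last)
    print(f"{first}-{last}: {avg:.1f} {exp:.1f} {act} {pct:.1f}")
```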

21.  We  have  thus  seven  points  on  the  diagram  to  indicate  the 
general  course  of  the  curve.  They  indicate  a rapid  increase  up 
to  a maximum  somewhere  in  the  age  group  80-84  followed  by  a 
rapid  decrease.  Following  these  indications  a curve  is  then  drawn 
somewhat  freely,  especially  towards  the  extremities  where  the 
figures  involved  are  small.  The  percentage  corresponding  to  each 
age  is  then  read  off  and  the  graduated  value  of  qx  computed  by 
taking  this  percentage  of  the  corresponding  value  of  qx  according 
to the $O^{M(5)}$ table. The new expected deaths are then computed
as  shown  in  the  table  on  page  17. 

22.  This  comparison  shows  expected  deaths  to  the  number  of 
398.5  compared  with  398  actual  deaths,  the  sum  of  the  positive 
deviations  being  46.2  and  that  of  the  negative  45.7.  The  total  of 
the  positive  accumulated  deviations  is  61.5  and  that  of  the  negative 
71.4 and there are eight changes of sign. The sum of the individual
deviations irrespective of sign is 91.9, which agrees fairly
well with the theoretical amount. The graduation therefore
satisfies  the  tests  regarding  agreement  with  the  original  data. 
An  examination  of  the  values  of  qx  shows  that  there  is  no  change  in 
the  sign  of  the  first  differences,  and  that  the  second  differences 
are  negative  for  only  ages  86  to  88  and  age  96.  Slight  adjustments 
were made by inspection above age 90 to eliminate irregularities
arising from the fact that the values of $q_x$ in the table used as a
basis were not calculated directly from the formula but from values
of $l_x$ and $d_x$ which were recorded to only a small number of figures.

Graduation of Data by Graphic Method

Age.  Percentage.  Graduated   Expected   Actual   Deviation.  Accumulated
                    $q_x$.     Deaths.    Deaths.              Deviation.
 55      67.7       .0141         .0         0        .0           .0
 56      67.8       .0151         .1         0      + .1         + .1
 57      67.9       .0161         .2         0      + .2         + .3
 58      68.1       .0173         .3         0      + .3         + .6
 59      68.3       .0186         .5         1      - .5         + .1
 60      68.6       .0200        1.0         1        .0         + .1
 61      68.9       .0216        1.2         3      -1.8         -1.7
 62      69.3       .0234        1.7         2      - .3         -2.0
 63      69.6       .0253        2.2         0      +2.2         + .2
 64      69.9       .0273        2.7         4      -1.3         -1.1
 65      70.3       .0297        3.2         1      +2.2         +1.1
 66      70.8       .0322        3.7         1      +2.7         +3.8
 67      71.5       .0352        4.5         3      +1.5         +5.3
 68      72.3       .0384        5.1         5      + .1         +5.4
 69      73.8       .0424        5.8        11      -5.2         + .2
 70      75.8       .0471        6.4         6      + .4         + .6
 71      78.8       .0530        7.6        12      -4.4         -3.8
 72      82.6       .0602        8.4        10      -1.6         -5.4
 73      87.2       .0689        9.9        11      -1.1         -6.5
 74      92.6       .0792       11.8         6      +5.8         - .7
 75      97.5       .0904       13.9        16      -2.1         -2.8
 76     101.6       .102        15.3        24      -8.7        -11.5
 77     104.5       .114        15.8         8      +7.8         -3.7
 78     106.9       .126        18.3        16      +2.3         -1.4
 79     108.5       .139        19.4        13      +6.4         +5.0
 80     109.4       .152        20.8        19      +1.8         +6.8
 81     109.8       .165        22.4        21      +1.4         +8.2
 82     109.7       .178        22.5        23      - .5         +7.7
 83     108.7       .191        23.3        26      -2.7         +5.0
 84     107.4       .204        22.2        26      -3.8         +1.2
 85     105.6       .217        19.8        23      -3.2         -2.0
 86     103.3       .230        17.7        21      -3.3         -5.3
 87     100.8       .242        15.9        16      - .1         -5.4
 88      97.9       .253        13.7        12      +1.7         -3.7
 89      94.0       .262        12.9        15      -2.1         -5.8
 90      90.1       .271        10.5         9      +1.5         -4.3
 91      86.8       .280         8.7         7      +1.7         -2.6
 92      83.8       .290         7.9         6      +1.9         - .7
 93      81.3       .302         6.7         7      - .3         -1.0
 94      79.3       .317         4.8         2      +2.8         +1.8
 95      78.0       .335         4.0         3      +1.0         +2.8
 96      77.0       .353         2.8         4      -1.2         +1.6
 97      76.2       .368         1.4         1      + .4         +2.0
 98      75.8       .384         1.1         2      - .9         +1.1
 99      75.6       .403          .4         1      - .6         + .5

Total                          398.5       398     +46.2        +61.5
                                                   -45.7        -71.4
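
The tests applied in article 22 can be recomputed from the expected and actual deaths of the graphic graduation. The sketch below is illustrative: expected deaths are entered in tenths as transcribed from the table headed "Graduation of Data by Graphic Method", the deviation is taken as expected less actual, and zero accumulated deviations are ignored in counting changes of sign. The variable names are hypothetical.

```python
# Graduated expected deaths (in tenths) and actual deaths, ages 55 to 99.
EXPECTED_TENTHS = [0, 1, 2, 3, 5, 10, 12, 17, 22, 27, 32, 37, 45, 51, 58,
                   64, 76, 84, 99, 118, 139, 153, 158, 183, 194,
                   208, 224, 225, 233, 222, 198, 177, 159, 137, 129,
                   105, 87, 79, 67, 48, 40, 28, 14, 11, 4]
ACTUAL = [0, 0, 0, 0, 1, 1, 3, 2, 0, 4, 1, 1, 3, 5, 11,
          6, 12, 10, 11, 6, 16, 24, 8, 16, 13,
          19, 21, 23, 26, 26, 23, 21, 16, 12, 15,
          9, 7, 6, 7, 2, 3, 4, 1, 2, 1]

# Deviation = expected - actual, kept in tenths of a death to avoid rounding.
deviation = [e - 10 * a for e, a in zip(EXPECTED_TENTHS, ACTUAL)]
accumulated = []
running = 0
for d in deviation:
    running += d
    accumulated.append(running)


def sign_changes(values):
    """Number of changes of sign, skipping exact zeros."""
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


positive = sum(d for d in deviation if d > 0) / 10
negative = -sum(d for d in deviation if d < 0) / 10
acc_positive = sum(a for a in accumulated if a > 0) / 10
acc_negative = -sum(a for a in accumulated if a < 0) / 10
changes = sign_changes(accumulated)
print(positive, negative, acc_positive, acc_negative, changes)
```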


Graduation  by  Interpolated  Series. 

23.  Milne’s  method  of  redistributing  the  population  and  deaths 
described  in  art.  16  amounts  in  effect  to  a graphic  interpolation 
performed  upon  the  function  formed  by  summing  continuously 
the  terms  to  be  redistributed.  The  method  of  interpolation  by 
means  of  a mathematical  formula  has  also  been  applied  in  various 
ways  to  the  graduation  of  mortality  tables  and  other  series. 

24.  In  the  construction  by  Farr  of  the  English  Life  Table  No.  3 
the  data  used  were  the  population  and  deaths  for  individual 
years  of  age  up  to  age  4 inclusive,  then  for  quinquennial  groups  to 
age  15  and  thereafter  for  decennial  groups.  The  rates  of  mortality 
for  ages  under  5 were  constructed  direct  from  the  data  and  re- 
tained unadjusted.  In  the  other  groups,  except  that  for  ages  10 
to  15,  the  assumption  was  made  that  the  total  deaths  divided 
by  the  total  population  would  give  the  force  of  mortality  at  the 
middle  of  the  group.  For  the  exceptional  group  a special  adjust- 
ment was  made,  which  was  based  on  the  result  of  an  analysis 
of  part  of  the  material.  This  adjustment  was  considered  necessary 
because  the  data  indicated  the  occurrence  of  a minimum  in  the 
middle  of  the  group.  (12) 

25. The forces of mortality so calculated for ages $7\tfrac{1}{2}$ and $12\tfrac{1}{2}$
were taken as respectively equal to $m_7$ and $m_{12}$ and the corresponding
rates of mortality were computed by the formula

\[ p_x = \frac{2 - m_x}{2 + m_x}. \]

In the case of the ten-year groups this method gave the force of
mortality for integral ages instead of for the middle of the year of age
and a special artifice was adopted to obtain the rates of mortality
corresponding to those ages. It was assumed that the force of
mortality increased in geometrical progression, so that the following
relation held: $\mu_{x+t} = r^{t}\mu_x$ where $r^{10} = \mu_{x+10}/\mu_x$. Then

\[ \operatorname{colog}_e p_x = \int_0^1 \mu_{x+t}\,dt = \frac{r - 1}{\log_e r}\,\mu_x, \]

or transforming into common logarithms

\[ \operatorname{Colog} p_x = \frac{k^{2}(r - 1)}{\operatorname{Log}_{10} r}\,\mu_x, \]

where $k = \operatorname{Log}_{10} e$. The values of $\log p_x$, for ages 3, 4, 7, 12,
and decennially thereafter, so obtained were used as the basis of
third difference interpolation after dividing the table into sections
for this purpose. The points of division for the male life table
were taken at ages 7, 20 and 50. (12)
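
The two formulas of this article can be illustrated numerically. The sketch below rests on assumptions: the forces of mortality are hypothetical values, the function names are illustrative, and the closed expression $k^{2}(r-1)\mu_x/\operatorname{Log}_{10} r$ is checked against a direct midpoint-rule integration of $\mu_{x+t} = r^{t}\mu_x$ over the year of age.

```python
import math


def p_from_central_rate(m):
    """p_x = (2 - m_x) / (2 + m_x)."""
    return (2.0 - m) / (2.0 + m)


def colog10_p(mu_x, mu_x_plus_10):
    """Common colog of p_x when the force of mortality grows geometrically."""
    r = (mu_x_plus_10 / mu_x) ** 0.1      # annual ratio, from r^10 = mu_{x+10} / mu_x
    k = math.log10(math.e)
    if r == 1.0:                          # constant force: limiting value of the formula
        return k * mu_x
    return k * k * (r - 1.0) / math.log10(r) * mu_x


# Hypothetical forces of mortality ten years apart (illustrative values only).
mu_40, mu_50 = 0.010, 0.016
closed_form = colog10_p(mu_40, mu_50)

# Independent check: integrate mu_{x+t} = r^t mu_x over the year by the midpoint rule.
r = (mu_50 / mu_40) ** 0.1
steps = 10_000
integral = sum(mu_40 * r ** ((i + 0.5) / steps) for i in range(steps)) / steps
numeric = math.log10(math.e) * integral
print(closed_form, numeric, p_from_central_rate(0.02))
```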

26.  It  has  been  pointed  out  that  the  assumption  that  the  ratio 
of  total  deaths  to  total  population  in  a group  represents  the  force 
of  mortality  at  the  middle  of  the  group  considerably  understates 
the  mortality  at  advanced  ages  and  if  adopted  in  connection  with 
a life insurance company’s experience somewhat overstates it at
the  younger  ages.  Some  correction  is  therefore  necessary  before 
the  method  can  be  applied  with  safety  and  Mr.  G.  King  has  devised 
a method  which  incorporates  the  principle  of  interpolation  more 
fully  than  Farr’s  method  just  described.  This  method  may  be 
applied  either  to  census  returns  of  population  and  deaths  or  to  the 
experience  of  insured  lives.  (13) 

27.  The  first  step  of  the  process  is  to  arrange  the  population,  or 
the  exposed  to  risk,  and  the  deaths  into  five-year  groups.  In  the 
case  of  an  insurance  experience  there  is  ordinarily  no  difficulty  in 
doing  so,  as  the  figures  are  usually  available  for  each  year  of  age. 
The  points  of  division,  however,  are  not  necessarily  exact  multiples 
of five years but are so chosen as to facilitate the subsequent
interpolations. In the case of census returns the figures are frequently
given for ten-year groups only at mature ages and it is necessary
to divide each group into two. This is done by forming tables of
$T_x$, the total population, and of $X_x$, the total deaths at age $x$ and
over for the values of $x$ which form the points of division between
the groups for which the figures are given. Values of these functions
for the middle of each group are then calculated by the
formula, in which $\Delta$ is taken over an interval of ten years.

\[
\begin{aligned}
U_x &= U_{x-15} + 1.5\,\Delta U_{x-15} + .375\,\Delta^2 U_{x-15} - .0625\,\Delta^3 U_{x-15} \\
    &= U_{x-5} + .5\,\Delta U_{x-5} - .125\,\Delta^2 U_{x-15} - .0625\,\Delta^3 U_{x-15} \\
    &= \tfrac{1}{2}(U_{x-5} + U_{x+5}) - \tfrac{1}{16}(\Delta U_{x+5} - \Delta U_{x-15}) \\
    &= \tfrac{1}{16}\{9(U_{x-5} + U_{x+5}) - (U_{x-15} + U_{x+15})\}.
\end{aligned}
\]

For the oldest group, where this formula could not be applied, the
formula used was in effect $U_x = \tfrac{1}{4}\{U_{x-15} - 4U_{x-10} + 6U_{x-5} + U_{x+5}\}$
where $U_{x-10}$ is the interpolated value for the middle of the preceding
group. In these formulas $U_x$ may be considered as representing
Log $T_x$ or Log $X_x$ as the case may be instead of $T_x$ or $X_x$ thus
reducing the size of the numbers involved. The first differences of
the natural numbers corresponding will give the population and
deaths in five-year groups.
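
Both bisection formulas assume that differences beyond the third vanish, so each should reproduce exactly any function of the third degree. The following minimal sketch checks this on a hypothetical cubic; the function names and the test polynomial are assumptions made for illustration.

```python
def mid_group_value(u_m15, u_m5, u_p5, u_p15):
    """U_x = {9(U_{x-5} + U_{x+5}) - (U_{x-15} + U_{x+15})} / 16."""
    return (9 * (u_m5 + u_p5) - (u_m15 + u_p15)) / 16


def last_group_value(u_m15, u_m10, u_m5, u_p5):
    """U_x = {U_{x-15} - 4 U_{x-10} + 6 U_{x-5} + U_{x+5}} / 4."""
    return (u_m15 - 4 * u_m10 + 6 * u_m5 + u_p5) / 4


def cubic(t):
    # Hypothetical test function; any polynomial of the third degree will do.
    return t ** 3 + 2 * t ** 2 + 3 * t + 4


x = 0
central = mid_group_value(cubic(x - 15), cubic(x - 5), cubic(x + 5), cubic(x + 15))
end = last_group_value(cubic(x - 15), cubic(x - 10), cubic(x - 5), cubic(x + 5))
print(central, end, cubic(x))
```

When $U_x$ stands for Log $T_x$ or Log $X_x$, the interpolated logarithms are converted back to natural numbers before differencing.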

28. The next step is to calculate the redistributed population or
exposed to risk and deaths for the middle year of each group of
five. Let $w_x$ represent the total for the group of five ages beginning
with age $x$ of either of these numbers and let $y_x$ be a function such
that $\Delta y_x = w_x$, where $\Delta$ is taken over a five-year interval. Then
we have

\[
y_{x+n} = y_x + \frac{n}{5}\,\Delta y_x + \frac{n(n-5)}{50}\,\Delta^2 y_{x-5}
        + \frac{n(n^2-25)}{750}\,\Delta^3 y_{x-5}
        + \frac{n(n^2-25)(n-10)}{15000}\,\Delta^4 y_{x-10},
\]

whence

\[
\begin{aligned}
y_{x+2} &= y_x + .4\,\Delta y_x - .12\,\Delta^2 y_{x-5} - .056\,\Delta^3 y_{x-5} + .0224\,\Delta^4 y_{x-10}, \\
y_{x+3} &= y_x + .6\,\Delta y_x - .12\,\Delta^2 y_{x-5} - .064\,\Delta^3 y_{x-5} + .0224\,\Delta^4 y_{x-10}, \\
\delta y_{x+2.5} &= .2\,\Delta y_x - .008\,\Delta^3 y_{x-5} \\
                 &= .2\,w_x - .008\,\Delta^2 w_{x-5},
\end{aligned}
\]

which is taken as the redistributed value required.
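
The resulting rule, $.2\,w_x - .008\,\Delta^2 w_{x-5}$, may be tested on a series whose individual values are known. The sketch below is illustrative only: the series $f(\text{age}) = \text{age}^2$ is hypothetical, and the rule is written as $(25\,w_x - \Delta^2 w_{x-5})/125$, which is the same expression cleared of decimals.

```python
def redistributed_middle_value(w_prev, w_mid, w_next):
    """0.2 w_x - 0.008 (w_{x+5} - 2 w_x + w_{x-5}), written without decimals."""
    second_difference = w_next - 2 * w_mid + w_prev
    return (25 * w_mid - second_difference) / 125


def quinquennial_total(f, start):
    """Total of f over the five ages beginning with `start`."""
    return sum(f(start + j) for j in range(5))


def f(age):
    return age * age      # hypothetical smooth series of individual values


w5, w10, w15 = (quinquennial_total(f, s) for s in (5, 10, 15))
middle = redistributed_middle_value(w5, w10, w15)   # value for the middle age, 12
print(w5, w10, w15, middle, f(12))
```

Here the group totals 255, 730 and 1455 return 144, the true value at age 12, as they should for a series with constant second differences.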




29.  From  these  redistributed  values  the  corresponding  rates  of 
mortality,  or  central  death  rates  as  the  case  may  be,  are  calculated 
giving  values  of  the  function  proceeding  by  quinquennial  intervals. 
In  order  to  complete  the  table  at  the  oldest  ages  it  is  necessary  to 
assume  an  age  at  which  qx  becomes  unity  or  mx  is  equal  to  2. 
This  assumption  should  be  made  to  harmonize  with  the  run  of  the 
experience  as  that  limit  is  approached;  the  series  of  quinquennial 
values  is  completed  by  a third  difference  interpolation  based  on 
this  value  and  the  three  highest  reliable  values.  It  is  well  to 
fix  upon  the  limiting  age  in  advance  if  possible  and  to  so  arrange 
the  grouping  that  it  will  form  a term  in  the  series  of  quinquennial 
values.  For  example  if  103  is  the  age  selected  the  groups  will 
run  16-20,  21-25,  etc.,  while  if  102  is  the  age  the  groups  will  be 
15-19,  20-24,  etc.  (13)  (14)  (15) 

30.  The  rates  for  intermediate  ages  are  then  determined  by 
interpolation  on  these  values  of  qx  or  of  mx  or  on  some  function  of 
the  values,  such  as  log  (qx  + .1),  which  is  more  nearly  arithmetic, 
by  an  osculatory  interpolation  formula  which  passes  continuously 
through  the  successive  groups.  King  suggests  the  use  of  Karup’s 
formula  which  may  be  written 

\[ u_{x+n} = u_x + n\,\Delta u_x + \frac{n(n-1)}{2}\,\Delta^2 u_{x-1} + \frac{n^2(n-1)}{2}\,\Delta^3 u_{x-1}, \]


where  n is  less  than  unity.  But  this  sometimes  gives  an  undulating 
curve  and  in  that  case  it  will  be  better  to  use  the  more  accurate 
formula, 

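\[
u_{x+n} = u_x + n\,\Delta u_x + \frac{n(n-1)}{2}\left(\Delta^2 u_x - \tfrac{1}{6}\Delta^4 u_{x-1}\right)
        + \frac{n(n-1)(n-2)}{6}\left(\Delta^3 u_{x-1} - \tfrac{1}{6}\Delta^5 u_{x-2}\right).
\]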


In  applying  this  formula  at  either  end  of  the  table  where  terms 
are  not  available  for  the  calculation  of  the  differences  required 
it  is  assumed  that  the  fifth  differences  that  cannot  be  computed 
vanish  and  the  other  differences  are  filled  in  consistently  with  that 
assumption.  A practical  method  of  applying  this  formula  is 
given  in  the  Transactions  of  the  Actuarial  Society  (Vol.  IX.,  page 
211). 

31. In applying Mr. King’s method to the experience summarized
in Art. 6 we have the following totals by five-year age
groups:

Ages.    Exposed.   Died.
55-59        67        1
60-64       362       10
65-69       617       21
70-74       711       45
75-79       728       77
80-84       634      115
85-89       337       87
90-94       134       31
95-99        28       11

32. It is evident that the number of deaths in each group is not
yet sufficiently large to give a regular progression of rates of
mortality. We accordingly seek a formula of greater weight
than that used by Mr. King. With the notation used above we
have:

\[
\begin{aligned}
y_{x+10} &= y_x + 2\Delta y_x + \Delta^2 y_{x-5} + \Delta^3 y_{x-5}, \\
y_{x-5} &= y_x - \Delta y_x + \Delta^2 y_{x-5}.
\end{aligned}
\]

Whence

\[ y_{x+10} - y_{x-5} = 3\Delta y_x + \Delta^3 y_{x-5}, \]

also

\[
\begin{aligned}
y_{x+15} &= y_x + 3\Delta y_x + 3\Delta^2 y_{x-5} + 4\Delta^3 y_{x-5} + \Delta^4 y_{x-10}, \\
y_{x-10} &= y_x - 2\Delta y_x + 3\Delta^2 y_{x-5} - \Delta^3 y_{x-5} + \Delta^4 y_{x-10}.
\end{aligned}
\]

Whence

\[ y_{x+15} - y_{x-10} = 5\Delta y_x + 5\Delta^3 y_{x-5}; \]

therefore

\[ 65(y_{x+10} - y_{x-5}) - 14(y_{x+15} - y_{x-10}) = 125\Delta y_x - 5\Delta^3 y_{x-5} = 625\,\delta y_{x+2.5}. \]

But

\[ y_{x+10} - y_{x-5} = w_{x-5} + w_x + w_{x+5} \]

and

\[ y_{x+15} - y_{x-10} = w_{x-10} + w_{x-5} + w_x + w_{x+5} + w_{x+10}. \]

So that

\[ 625\,\delta y_{x+2.5} = 51(w_{x-5} + w_x + w_{x+5}) - 14(w_{x-10} + w_{x+10}). \]
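
The five-group formula may be applied to the totals of article 31 as in the following sketch. Two assumptions are made and should be noted: groups lying outside the range of the experience are treated as empty, and the comparison with the printed figures is confined to the central ages 62, 67, 82 and 87; other ages are not checked here. The function names are illustrative.

```python
# Five-year group totals from article 31: groups 55-59, 60-64, ..., 95-99.
EXPOSED = [67, 362, 617, 711, 728, 634, 337, 134, 28]
DIED = [1, 10, 21, 45, 77, 115, 87, 31, 11]


def weighted_625(w, i):
    """625 times the central value: 51(w_{x-5} + w_x + w_{x+5}) - 14(w_{x-10} + w_{x+10})."""
    def at(j):
        # Assumption: a group outside the experience contributes nothing.
        return w[j] if 0 <= j < len(w) else 0
    return 51 * (at(i - 1) + at(i) + at(i + 1)) - 14 * (at(i - 2) + at(i + 2))


def central_rate(i):
    """Rate of mortality at the middle age of group i (the factor 625 cancels)."""
    return weighted_625(DIED, i) / weighted_625(EXPOSED, i)


for i, age in [(1, 62), (2, 67), (5, 82), (6, 87)]:
    print(age, weighted_625(EXPOSED, i), weighted_625(DIED, i), round(central_rate(i), 4))
```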

33. Applying this formula to the above figures we obtain the
results shown below, where $E_x$ and $\theta_x$, respectively, represent
Exposed to Risk and Deaths between ages $x$ and $x + 1$.

The value of $q_{97}$ so obtained, however, is evidently unreliable
and is rejected, a new value being calculated by interpolation, on
the assumption that $q_{102}$ is unity and fourth differences vanish.
The formula is

\[ q_{102} - 4q_{97} + 6q_{92} - 4q_{87} + q_{82} = \Delta^4 q_{82} = 0 \]

or

\[ 4q_{97} = q_{102} + 6q_{92} - 4q_{87} + q_{82}. \]

The value so obtained is .5111. The value of $q_{57}$ is also unreliable,
but is not seriously inconsistent with the other values
and is accordingly retained for the purpose of completing the table
so as to permit of a complete comparison of actual with expected
deaths.
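
The arithmetic for $q_{97}$ can be checked in a few lines. This is a small illustrative sketch, using the values of $q_{82}$, $q_{87}$ and $q_{92}$ tabulated in this article and $q_{102} = 1$; the variable names are hypothetical.

```python
# Quinquennial values from the text, with q_102 assumed to be unity.
q82, q87, q92, q102 = 0.1765, 0.2327, 0.2998, 1.0

# Solve the vanishing fourth difference for q_97.
q97 = (q102 + 6 * q92 - 4 * q87 + q82) / 4

# Substituting back, the fourth difference should vanish.
fourth_difference = q102 - 4 * q97 + 6 * q92 - 4 * q87 + q82
print(round(q97, 4), fourth_difference)
```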


Age.    $625E_x$.   $625\theta_x$.    $q_x$.
 57      12,941          267        .0206
 62      43,392        1,002        .0231
 67      75,060        2,784        .0371
 72      88,912        5,543        .0623
 77      92,371       10,575        .1145
 82      74,819       13,165        .1765
 87      45,771       10,651        .2327
 92      16,573        4,969        .2998
 97       3,549          884        .2491

34. We have therefore as a basis of interpolation the following
values and differences:

  $x$    $10^4 q_x$.   $10^4 \Delta q_x$.   $10^4 \Delta^2 q_x$.   $10^4 \Delta^3 q_{x-5}$.
  57        206            25               115
  62        231           140               112                - 3
  67        371           252               270                158
  72        623           522                98               -172
  77       1145           620               -58               -156
  82       1765           562               109                167
  87       2327           671              1442               1333
  92       2998          2113              2776               1334
  97       5111          4889
 102      10000

Interpolating  then  by  Karup’s  formula  and  continuing  down  to 
age  55  on  the  assumption  of  constant  second  differences,  we  obtain 
the  rates  of  mortality  shown  in  the  table  on  page  24. 
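
The interpolation between ages 67 and 72 may be reproduced as follows. The sketch assumes Karup's formula in the form $u_{x+n} = u_x + n\Delta u_x + \tfrac{1}{2}n(n-1)\Delta^2 u_{x-1} + \tfrac{1}{2}n^2(n-1)\Delta^3 u_{x-1}$, with the quinquennial interval as unit so that $n$ takes the values .2, .4, .6 and .8; the function name is illustrative. The four values used are $10^4 q_x$ at ages 62, 67, 72 and 77 from the table of article 34.

```python
def karup(u_prev, u0, u1, u2, n):
    """Osculatory interpolation between u0 and u1 for 0 <= n <= 1."""
    d1 = u1 - u0                              # first difference of u_x
    d2 = u1 - 2 * u0 + u_prev                 # second difference of u_{x-1}
    d3 = u2 - 3 * u1 + 3 * u0 - u_prev        # third difference of u_{x-1}
    return u0 + n * d1 + n * (n - 1) / 2 * d2 + n * n * (n - 1) / 2 * d3


# 10^4 q_x at ages 62, 67, 72 and 77.
values = [karup(231, 371, 623, 1145, k / 5) for k in range(1, 5)]
rounded = [round(v) for v in values]          # ages 68, 69, 70, 71
print(rounded)
```

The rounded results, 410, 451, 497 and 554, agree with the rates .0410, .0451, .0497 and .0554 shown for ages 68 to 71 in the table of graduated values.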


Graduation of Data by Interpolation

 $x$.    $q_x$.    Expected   Actual   Deviation.  Accumulated
                   Deaths.    Deaths.              Deviation.
 55     .0228         .0         0        .0           .0
 56     .0215         .1         0      + .1         + .1
 57     .0206         .2         0      + .2         + .3
 58     .0202         .4         0      + .4         + .7
 59     .0202         .6         1      - .4         + .3
 60     .0207        1.0         1        .0         + .3
 61     .0217        1.3         3      -1.7         -1.4
 62     .0231        1.5         2      - .5         -1.9
 63     .0250        2.1         0      +2.1         + .2
 64     .0273        2.7         4      -1.3         -1.1
 65     .0301        3.2         1      +2.2         +1.1
 66     .0334        3.8         1      +2.8         +3.9
 67     .0371        4.8         3      +1.8         +5.7
 68     .0410        5.4         5      + .4         +6.1
 69     .0451        6.1        11      -4.9         +1.2
 70     .0497        6.7         6      + .7         +1.9
 71     .0554        7.9        12      -4.1         -2.2
 72     .0623        8.7        10      -1.3         -3.5
 73     .0709       10.2        11      - .8         -4.3
 74     .0808       12.0         6      +6.0         +1.7
 75     .0916       14.1        16      -1.9         - .2
 76     .103        15.5        24      -8.5         -8.7
 77     .115        16.0         8      +8.0         - .7
 78     .126        18.3        16      +2.3         +1.6
 79     .139        19.5        13      +6.5         +8.1
 80     .152        20.8        19      +1.8         +9.9
 81     .164        22.3        21      +1.3        +11.2
 82     .177        22.3        23      - .7        +10.5
 83     .188        23.7        26      -2.3         +8.2
 84     .199        21.7        26      -4.3         +3.9
 85     .210        19.1        23      -3.9         - .0
 86     .221        17.0        21      -4.0         -4.0
 87     .233        10.6        16      -5.4         -9.4
 88     .243        13.1        12      +1.1         -8.3
 89     .252        12.3        15      -2.7        -11.0
 90     .262        10.2         9      +1.2         -9.8
 91     .277         8.6         7      +1.6         -8.2
 92     .300         8.1         6      +2.1         -6.1
 93     .328         7.2         7      + .2         -5.9
 94     .361         5.4         2      +3.4         -2.5
 95     .401         4.8         3      +1.8         - .7
 96     .449         3.6         4      - .4         -1.1
 97     .511         2.0         1      +1.0         - .1
 98     .587         1.8         2      - .2         - .3
 99     .675          .7         1      - .3         - .6

Total              397.4       398     +49.0        +76.9
                                       -49.6        -92.0


Summation Formulas.

35. Before entering upon a discussion of graduation by summation
formulas it will be well to explain a system of notation covering
the various operations involved and to show in what way the
principle of the separation of symbols will apply to the symbols
adopted to represent these operations. The fundamental operation
involved, in addition to the elementary ones of addition, subtraction,
multiplication and division, is that of proceeding to the
next  term  of  the  series.  Let  us  designate  this  operation  by  writing 
E before  the  function  operated  upon.  Then  we  have  EUX  — Ux+1; 
E2  Ux  = EE  UX  = E Ux+ 1 = XJx+ 2 and  generally  En  Ux  = Ux+n-  It  is  then 
evident  that  EmEnUx  = Ux+Tn+n  = Em+nUx  so  that  the  exponential 
law  applies  in  the  same  way  to  E as  if  it  were  an  ordinary  quantity. 
We  have  also  E(UX  + Vx)  = Ux+ 1 + Vx+i  = EUX  + EVX  so  that 
the  distributive  law  applies.  Also  EaUx  = aUx+ 1 = aEUx,  where 
a is  a quantity,  so  that  the  commutative  law  applies  to  the  opera- 
tion in  combination  with  ordinary  quantities.  Also  we  may  extend 
the  definition  and  say  that  generally 

(E  cl) Ux  = EU x ~f"  aU x — U x+ 1 -f-  dU x — aUx  -f-  Ux+ 1 

= (a  + E)UX. 

With  this  understanding  it  is  evident  that  the  operator  E may  be 
separated  from  the  function  upon  which  it  operates  and  treated 
in  algebraic  transformations  as  if  it  were  an  ordinary  quantity. 
It  is  to  be  carefully  observed  that  the  function  operated  upon  must 
not  enter  into  the  transformation.  In  other  words  the  quantities 
entering  into  the  transformations  must  be  constants  and  not 
functions  subject  to  the  operator.  From  the  above  reasoning  it 
is  evident  that  any  other  operator  which  may  be  expressed  in 
terms  of  E and  constants  will  possess  the  same  property.  Coming 
under  this  general  class  we  have  the  following  operations : 


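$$\Delta = E - 1, \quad\text{or}\quad \Delta U_x = U_{x+1} - U_x,$$

$$D = \lim_{h=0}\frac{E^h - 1}{h} = \log_e E, \quad\text{or}\quad DU_x = \lim_{h=0}\frac{U_{x+h} - U_x}{h} = \frac{dU_x}{dx},$$

$$\delta = E^{1/2} - E^{-1/2}, \quad\text{or}\quad \delta U_x = U_{x+1/2} - U_{x-1/2},$$

$$\{n\} = E^{n/2} - E^{-n/2}, \quad\text{or}\quad \{n\}U_x = U_{x+n/2} - U_{x-n/2},$$

$$\gamma_n = E^n + E^{-n}, \quad\text{or}\quad \gamma_nU_x = U_{x+n} + U_{x-n},$$

$$[n] = \frac{E^{n/2} - E^{-n/2}}{E^{1/2} - E^{-1/2}} = \frac{\{n\}}{\delta}, \quad\text{or}\quad [n]U_x = U_{x+(n-1)/2} + U_{x+(n-3)/2} + \cdots + U_{x-(n-1)/2}.$$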


From the above expansion it will be seen that $[n]U_x$ is the sum
of $n$ terms with respect to which $U_x$ is centrally situated. It will
be noted that where $n$ is an even number $U_x$ itself does not
appear in the summation.
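The operators just defined can be tried on a tabulated series. The following is only a minimal sketch: the function names and the illustrative series $U_x = x^3$ are assumptions made for the example and are not taken from the text. It also checks the relation $[3] = 3 + \delta^2$, which follows directly from the definitions.

```python
def shift(u, x, n=1):
    """E^n U_x = U_{x+n}; `u` maps integer arguments x to values U_x."""
    return u[x + n]

def central_sum(u, x, n):
    """[n]U_x for odd n: the sum of the n terms centred on U_x."""
    if n % 2 == 0:
        raise ValueError("for even n the sum is centred between tabulated terms")
    half = (n - 1) // 2
    return sum(u[x + k] for k in range(-half, half + 1))

def gamma(u, x, n):
    """gamma_n U_x = U_{x+n} + U_{x-n}."""
    return u[x + n] + u[x - n]

def second_central_difference(u, x):
    """delta^2 U_x = U_{x+1} - 2 U_x + U_{x-1}."""
    return u[x + 1] - 2 * u[x] + u[x - 1]

# Illustrative series only (hypothetical data): U_x = x^3 for x = 0..10.
u = {x: x ** 3 for x in range(0, 11)}
x = 5
print(central_sum(u, x, 3))                            # [3]U_5
print(3 * u[x] + second_central_difference(u, x))      # (3 + delta^2)U_5
print(gamma(u, x, 2))                                  # gamma_2 U_5
```

The first two printed values agree, as the identity requires.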
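36. For the purpose of demonstrating the possibility of separating
the symbols of operation it was convenient to start with $E$ as
the fundamental operation and to express the others in terms of it,
but for our future purposes it will be convenient to express the
various operators in terms of ascending powers of $D$.

We have

$$D = \log_e E \quad\text{or}\quad E = e^D = 1 + D + \frac{D^2}{2} + \frac{D^3}{6} + \frac{D^4}{24} + \cdots,$$

which is readily seen to follow directly from Taylor's series, since

$$EU_x = U_{x+1} = U_x + \frac{dU_x}{dx} + \frac{1}{2}\frac{d^2U_x}{dx^2} + \frac{1}{6}\frac{d^3U_x}{dx^3} + \frac{1}{24}\frac{d^4U_x}{dx^4} + \cdots$$

$$= \left(1 + D + \tfrac{1}{2}D^2 + \tfrac{1}{6}D^3 + \tfrac{1}{24}D^4 + \cdots\right)U_x = e^DU_x.$$

Similarly

$$\delta = e^{D/2} - e^{-D/2} = D + \frac{D^3}{24} + \frac{D^5}{1920} + \cdots,$$

$$E^n = e^{nD} = 1 + nD + \frac{n^2D^2}{2} + \frac{n^3D^3}{6} + \frac{n^4D^4}{24} + \frac{n^5D^5}{120} + \cdots,$$

$$\{n\} = E^{n/2} - E^{-n/2} = nD + \frac{n^3D^3}{24} + \frac{n^5D^5}{1920} + \cdots,$$

$$\gamma_n = E^n + E^{-n} = 2 + n^2D^2 + \frac{n^4D^4}{12} + \cdots,$$

$$[n] = \frac{\{n\}}{\delta} = \frac{nD + n^3D^3/24 + n^5D^5/1920 + \cdots}{D + D^3/24 + D^5/1920 + \cdots}$$

$$= n + \frac{n(n^2 - 1)}{24}D^2 + \frac{n\{5(n^2 - 1)^2 - 2(n^4 - 1)\}}{5760}D^4 + \cdots.$$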

37. An examination of King's method already described will
show that, in determining the rates of mortality to be used as a
basis for the interpolated series, he uses adjusted values of the
exposed and of the deaths. These adjusted values are obtained
by grouping and redistributing according to the principles of
finite differences, but the final formula for the adjusted value of
$U_x$ expressed in the above notation is

$$U_x' = \tfrac{1}{5}[5]U_x - \tfrac{1}{125}\{5\}^2[5]U_x = \frac{[5]}{5}\left\{1 - \tfrac{1}{25}\{5\}^2\right\}U_x = \frac{[5]}{125}(27 - \gamma_5)U_x,$$

since $\{n\}^2$ is seen from its definition to be equal to $\gamma_n - 2$. The
process followed in this preliminary adjustment partakes therefore
of the nature of a summation formula. Expanding the operators
in this last expression in powers of $D$ to the fifth inclusive we have

$$\left(1 + D^2 + \tfrac{17}{60}D^4 + \cdots\right)\left(1 - D^2 - \tfrac{25}{12}D^4 - \cdots\right) = \left(1 - \tfrac{14}{5}D^4 - \cdots\right).$$

38. The first summation method employed was that devised by
Mr. Woolhouse. It was originally arrived at by a process of
interpolation. Each of the five sets of quinquennial values was
made the basis of a second difference interpolation and the
intermediate values computed. There were thus obtained for each
point five values, one of which was the original value and the
remaining four interpolated values. The average of these five
values was taken as the graduated value of the function. The
general formula for such an interpolation may be written

$$U_x = \frac{(x - b)(x - c)}{(a - b)(a - c)}U_a + \frac{(x - a)(x - c)}{(b - a)(b - c)}U_b + \frac{(x - a)(x - b)}{(c - a)(c - b)}U_c.$$

The expressions for these five values are then as follows:

(1)  $-\tfrac{3}{25}U_{x-7} + \tfrac{21}{25}U_{x-2} + \tfrac{7}{25}U_{x+3}$,

(2)  $-\tfrac{2}{25}U_{x-6} + \tfrac{24}{25}U_{x-1} + \tfrac{3}{25}U_{x+4}$,

(3)  $U_x$,

(4)  $\tfrac{3}{25}U_{x-4} + \tfrac{24}{25}U_{x+1} - \tfrac{2}{25}U_{x+6}$,

(5)  $\tfrac{7}{25}U_{x-3} + \tfrac{21}{25}U_{x+2} - \tfrac{3}{25}U_{x+7}$.

The average of the five values then takes the following form,
omitting $x$ from the subscript:

$$U_0' = \tfrac{1}{125}\{25U_0 + 24(U_1 + U_{-1}) + 21(U_2 + U_{-2}) + 7(U_3 + U_{-3}) + 3(U_4 + U_{-4}) - 2(U_6 + U_{-6}) - 3(U_7 + U_{-7})\}$$

$$= \tfrac{1}{125}\{25 + 24\gamma_1 + 21\gamma_2 + 7\gamma_3 + 3\gamma_4 - 2\gamma_6 - 3\gamma_7\}U_0. \quad (16)$$

39. It was later discovered, however, that a graduation by
Woolhouse's formula could be effected by means of a summation
process. If we designate by $G_w$ the operation of graduating by
this formula, we have

$$125G_wU_0 = 25U_0 + 24(U_1 + U_{-1}) + 21(U_2 + U_{-2}) + 7(U_3 + U_{-3}) + 3(U_4 + U_{-4}) - 2(U_6 + U_{-6}) - 3(U_7 + U_{-7}).$$

But it can readily be seen by actual expansion that

$$[5]^3U_0 = (E^2 + E + 1 + E^{-1} + E^{-2})^3U_0$$

$$= 19U_0 + 18(U_1 + U_{-1}) + 15(U_2 + U_{-2}) + 10(U_3 + U_{-3}) + 6(U_4 + U_{-4}) + 3(U_5 + U_{-5}) + (U_6 + U_{-6}).$$

So that

$$125G_wU_0 = [5]^3U_0 + 6(U_{-2} + U_{-1} + U_0 + U_1 + U_2) - 3(U_{-7} + U_{-6} + U_{-5} + U_{-4} + U_{-3}) - 3(U_3 + U_4 + U_5 + U_6 + U_7)$$

$$= [5]^3U_0 - 3\{5\}^2[5]U_0 = [5]^3U_0 - 3\delta^2[5]^3U_0 = [5]^3(1 - 3\delta^2)U_0 = [5]^3\{10 - 3[3]\}U_0,$$

since $[3] = 3 + \delta^2$. Thus

$$G_w = \tfrac{1}{125}[5]^3(1 - 3\delta^2) = \tfrac{1}{125}[5]^3\{10 - 3[3]\}.$$

It will be noticed that the formula calls for a final division by
125, but the number of figures involved in the summations can
be reduced by dividing by 10 after the first summation in fives,
so that the working formula becomes $\tfrac{2}{25}[5]^2 \cdot \tfrac{1}{10}[5]\{10 - 3[3]\}$. (17)
(18)
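The identity between the summation form and the original coefficients of Woolhouse's formula can be checked by multiplying the operators out as polynomials in $E$. The sketch below is illustrative only; the dictionary representation and the function names are assumptions made for the example.

```python
def multiply(a, b):
    """Product of two operators written as {power of E: coefficient}."""
    out = {}
    for i, ca in a.items():
        for j, cb in b.items():
            out[i + j] = out.get(i + j, 0) + ca * cb
    return {k: c for k, c in out.items() if c != 0}

def summation(n):
    """[n] for odd n: E^{-(n-1)/2} + ... + E^{(n-1)/2}."""
    half = (n - 1) // 2
    return {k: 1 for k in range(-half, half + 1)}

five = summation(5)
adjust = {-1: -3, 0: 7, 1: -3}      # 10 - 3[3] = 7 - 3(E + 1/E)
numerators = multiply(multiply(five, five), multiply(five, adjust))

# Coefficient of (U_k + U_{-k}) in 125 G_w U_0, for k = 0, 1, ..., 7.
for k in range(0, 8):
    print(k, numerators.get(k, 0))
```

The printed coefficients are those of the expression for $125G_wU_0$, the term in $U_5 + U_{-5}$ vanishing.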

40.  The  schedule  on  page  29  shows  the  actual  work  of  applying 
this  formula  to  the  graduation  of  a series  of  values  of  qx. 

41. Expanding $G_w$ in ascending powers of $D$ to the fourth
inclusive we have

$$G_w = \tfrac{1}{125}[5]^3(1 - 3\delta^2) = \tfrac{1}{125}[5]^3\{10 - 3[3]\}$$

$$= \tfrac{1}{125}\left(5 + 5D^2 + \tfrac{17}{12}D^4 + \cdots\right)^3\left(1 - 3D^2 - \tfrac{1}{4}D^4\right)$$

$$= \left(1 + D^2 + \tfrac{17}{60}D^4 + \cdots\right)^3\left(1 - 3D^2 - \tfrac{1}{4}D^4\right)$$

$$= \left(1 + 3D^2 + \tfrac{77}{20}D^4 + \cdots\right)\left(1 - 3D^2 - \tfrac{1}{4}D^4\right)$$

$$= (1 - 5.4D^4 - \cdots).$$

This shows that a series such that the fourth and higher
differential coefficients vanish will be exactly reproduced and that a
series for which those coefficients are relatively small will be
approximately reproduced. The assumption underlying a graduation
by such a formula as this is therefore that within the range
covered by the formula the curve representing the true values of
the function may be represented by a parabola of the third order.
Table Illustrating Woolhouse's Formula

Columns: (1) 10^5 q_x; (2) [3](1); (3) 3(2); (4) 10(1) - (3);
(5) (1/10)[5](4); (6) [5](5); (7) [5](6); (8) (2/25)(7).

  x     (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)

 23     511
 24     633   1,830   5,490     840
 25     686   1,963   5,889     971
 26     644   1,932   5,796     644     332
 27     602   1,711   5,133     887     266
 28     465   1,556   4,668     -18     300   1,267
 29     489   1,573   4,719     171     225   1,305
 30     619   1,625   4,875   1,315     144   1,420   7,065     565
 31     517   1,757   5,271    -101     370   1,444   7,704     616
 32     621   2,047   6,141      69     381   1,629   8,281     662
 33     909   2,283   6,849   2,241     324   1,906   8,724     698
 34     753   2,414   7,242     288     410   1,882   9,329     746
 35     752   2,258   6,774     746     421   1,863   9,825     786
 36     753   2,258   6,774     756     346   2,049  10,086     807
 37     753   2,451   7,353     177     362   2,125  10,528     842
 38     945   2,653   7,959   1,491     510   2,167  11,229     898
 39     955   3,035   9,105     445     486   2,324  11,745     940
 40   1,135   3,038   9,114   2,236     463   2,564  12,306     984
 41     948   2,991   8,973     507     503   2,565
 42     908   3,042   9,126     -46     602   2,686
 43   1,186   3,324   9,972   1,888     511
 44   1,230   3,623  10,869   1,431     607
 45   1,207   3,579  10,737   1,333
 46   1,142   3,320   9,960   1,460
 47     971
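The successive columns of the schedule can be followed in code. The sketch below reproduces the graduated value at age 30 from the values of $10^5q_x$ at ages 23 to 37 in column (1). The function names are hypothetical, and the rule of rounding to the nearest integer, with halves taken upward, is an assumption about how fractions were dropped in the printed schedule.

```python
# 10^5 q_x for ages 23..37, copied from column (1) of the table.
Q = dict(zip(range(23, 38),
             [511, 633, 686, 644, 602, 465, 489, 619, 517, 621,
              909, 753, 752, 753, 753]))

def running_sum(col, n):
    """Apply [n] (n odd) wherever all n terms are available."""
    h = n // 2
    return {x: sum(col[x + k] for k in range(-h, h + 1))
            for x in col if x - h in col and x + h in col}

def round_div(value, d):
    """value / d rounded to the nearest integer, halves upward (assumed rule)."""
    return (2 * value + d) // (2 * d)

def woolhouse(q):
    col2 = running_sum(q, 3)                              # (2) = [3](1)
    col4 = {x: 10 * q[x] - 3 * col2[x] for x in col2}     # (4) = 10(1) - 3(2)
    col5 = {x: round_div(v, 10)                           # (5) = (1/10)[5](4)
            for x, v in running_sum(col4, 5).items()}
    col6 = running_sum(col5, 5)                           # (6) = [5](5)
    col7 = running_sum(col6, 5)                           # (7) = [5](6)
    return {x: round_div(2 * v, 25) for x, v in col7.items()}   # (8) = (2/25)(7)

print(woolhouse(Q))
```

With fifteen ungraduated values only the central age can be graduated, which illustrates the loss of seven terms at each end of the series.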

42. Mr. Higham, who devised this method of applying the
formula, suggested a modification which effects a marked improvement
in the results. This consisted in substituting $[3] - \gamma_2$ or $2[3] - [5]$
for $1 - 3\delta^2$. Designating his graduation by $G_h$ we have

$$G_h = \tfrac{1}{125}[5]^3\{[3] - \gamma_2\}$$

$$= \left(1 + 3D^2 + \tfrac{77}{20}D^4 + \cdots\right)\left(1 - 3D^2 - \tfrac{5}{4}D^4\right)$$

$$= (1 - 6.4D^4 - \cdots).$$

The table on page 30 shows the practical application of this
formula to the series to which Woolhouse's was applied. (17)

Table Illustrating Higham's Formula

Columns: (1) 10^5 q_x; (2) [3](1); (3) gamma_2(1); (4) (2) - (3);
(5) (1/10)[5](4); (6) [5](5); (7) [5](6); (8) (2/25)(7).

  x     (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)

 22     559
 23     511   1,703
 24     633   1,830   1,203     627
 25     686   1,963   1,113     850
 26     644   1,932   1,098     834     314
 27     602   1,711   1,175     536     297
 28     465   1,556   1,263     293     266   1,327
 29     489   1,573   1,119     454     218   1,317
 30     619   1,625   1,086     539     232   1,383   7,168     573
 31     517   1,757   1,398     359     304   1,485   7,660     613
 32     621   2,047   1,372     675     363   1,656   8,228     658
 33     909   2,283   1,269   1,014     368   1,819   8,767     701
 34     753   2,414   1,374   1,040     389   1,885   9,295     744
 35     752   2,258   1,662     596     395   1,922   9,751     780
 36     753   2,258   1,698     560     370   2,013  10,130     810
 37     753   2,451   1,707     744     400   2,112  10,595     848
 38     945   2,653   1,888     765     459   2,198  11,168     893
 39     955   3,035   1,701   1,334     488   2,350  11,760     941
 40   1,135   3,038   1,853   1,185     481   2,495  12,317     985
 41     948   2,991   2,141     850     522   2,605
 42     908   3,042   2,365     677     545   2,669
 43   1,186   3,324   2,155   1,169     569
 44   1,230   3,623   2,050   1,573     552
 45   1,207   3,579   2,157   1,422
 46   1,142   3,320   2,642     678
 47     971   3,525
 48   1,412

43. Each term $U_x$ of the series to which a summation formula
is applied may be considered as made up of two parts: $V_x$, the true
value of the function which would be arrived at on a sufficiently
broad experience, and $E_x$, the error or departure from that value,
so that we have $U_x = V_x + E_x$. Using $G$ then as the general
symbol for a graduation to which the distributive law applies we
have $GU_x = G(V_x + E_x) = GV_x + GE_x$. The effect of the
graduation therefore consists of two parts, one being its effect
on the smooth series of true values and the other its effect on the
errors. As regards the first it is required that the series shall be
reproduced as nearly as possible, and as regards the second it is
required that the errors shall be reduced as far as possible and that
the residual errors shall form a continuous series. The assumption
is usually made with respect to $V_x$ that fourth and higher
differences may be neglected. Summation formulas are therefore
generally arranged so that the coefficient of $D^2$ either vanishes or
is very small, a value of 1/12 of the opposite sign to the coefficient
of $D^4$ being sometimes permitted.

44. A summation process of graduation usually consists of
three summations, not necessarily over equal intervals, following
a preliminary adjustment and followed by a division by the number
necessary to reduce the function to the original scale. The general
expression for this operation may therefore be written

$$G = \frac{[p][q][r]}{pqr}\{1 + 2(a + b + c) - a\gamma_1 - b\gamma_2 - c\gamma_3\}.$$

But we have generally

$$\frac{[n]}{n} = 1 + \frac{n^2 - 1}{24}D^2 + \frac{5(n^2 - 1)^2 - 2(n^4 - 1)}{5760}D^4 + \cdots,$$

so that to the third power of $D$ inclusive we have

$$G = \left(1 + \frac{p^2 - 1}{24}D^2\right)\left(1 + \frac{q^2 - 1}{24}D^2\right)\left(1 + \frac{r^2 - 1}{24}D^2\right)\{1 - (a + 4b + 9c)D^2\}$$

$$= 1 + \left\{\frac{p^2 + q^2 + r^2 - 3}{24} - (a + 4b + 9c)\right\}D^2.$$

In order therefore that the formula may be correct to third
differences we must have

$$a + 4b + 9c = \frac{p^2 + q^2 + r^2 - 3}{24}.$$

In order that the graduated values obtained shall correspond to
integral values of the variable it is also necessary that $p + q + r$
should be an odd number, so that $p$, $q$ and $r$ must be all odd
numbers or two even and one odd. In Woolhouse's formula $p$, $q$ and $r$
are each equal to 5, and $a$ is equal to 3, $b$ and $c$ being each zero,
so that the relation holds. In Higham's formula $a$ is equal to $-1$
and $b$ is equal to 1, so that $a + 4b = 3$ and the relation holds.
In Hardy's modification of Higham's formula $[4][5][6]/120$ is
substituted for $[5]^3/125$ so that we have

$$G = 1 + \left\{\frac{16 + 25 + 36 - 3}{24} - 3\right\}D^2 = 1 + \tfrac{1}{12}D^2,$$

showing a second difference error in this case. In Karup's 19-term
formula the summations are in fives and the values of $a$, $b$ and $c$
are $-\tfrac{3}{5}$, 0 and $\tfrac{2}{5}$ respectively, so that
$a + 4b + 9c = \tfrac{18}{5} - \tfrac{3}{5} = 3$,
and the relation holds. The expression for this formula is then

$$G = \frac{[5]^3}{125}\left(\tfrac{3}{5} + \tfrac{3}{5}\gamma_1 - \tfrac{2}{5}\gamma_3\right) = \frac{[5]^3}{625}\{3[3] - 2\gamma_3\}.$$

In Spencer's 21-term formula two summations in fives and one
in sevens are used and $(p^2 + q^2 + r^2 - 3)/24$ becomes

$$\frac{25 + 25 + 49 - 3}{24} = 4;$$

also $a = -\tfrac{1}{2}$, $b = 0$ and $c = \tfrac{1}{2}$, so that
$a + 4b + 9c = \tfrac{9}{2} - \tfrac{1}{2} = 4$,
and the formula is correct to third differences. This formula
may be expressed symbolically as follows:

$$\frac{[5]^2[7]}{350}\{1 + [3] - \gamma_3\}.$$
The  following  table  shows  the  method  given  by  Spencer  for 
applying  the  formula  to  the  graduation  of  a table  of  qx : (19) 

Table Illustrating Spencer's Formula

Columns: (1) 10^5 q_x; (2) (1/7)(1); (3) [3](2); (4) gamma_3(2);
(5) (2) + (3) - (4); (6) [7](5); (7) (1/5)(6); (8) [5](7);
(9) (1/10)[5](8).

  x    (1)    (2)    (3)    (4)    (5)    (6)    (7)    (8)    (9)

 20    569     81
 21    235     34    195
 22    559     80    187
 23    511     73    243    173    143
 24    633     90    261    120    231
 25    686     98    280    146    232
 26    644     92    276    143    225  1,212    242
 27    602     86    244    178    152  1,173    235
 28    465     66    222    172    116  1,093    219  1,129
 29    489     70    224    181    113  1,066    213  1,131
 30    619     88    232    216    104  1,102    220  1,158    592
 31    517     74    251    174    151  1,221    244  1,212    617
 32    621     89    293    177    205  1,311    262  1,289    652
 33    909    130    327    196    261  1,363    273  1,383    693
 34    753    108    345    182    271  1,448    290  1,478    735
 35    752    107    323    224    206  1,569    314  1,566    777
 36    753    108    323    266    165  1,695    339  1,639    817
 37    753    108    351    270    189  1,752    350  1,705    856
 38    945    135    379    242    272  1,732    346  1,778    896
 39    955    136    433    238    331  1,782    356  1,872    938
 40  1,135    162    433    277    318  1,936    387  1,971    979
 41    948    135    427    311    251  2,166    433  2,054
 42    908    130    434    308    256  2,245    449  2,114
 43  1,186    169    475    325    319  2,146    429
 44  1,230    176    517    274    419  2,081    416
 45  1,207    172    511    332    351
 46  1,142    163    474    405    232
 47    971    139    504    390    253
 48  1,412    202    577
 49  1,654    236    652
 50  1,498    214
45. We have hitherto confined our attention largely to the
effect of the graduation on $V_x$, and we have seen that a great
variety of formulas may be devised which will reproduce, or
practically reproduce, a curve of the third degree. In order to
obtain a complete view of these formulas we must now consider
their effect on $E_x$. In this investigation we will assume that the
function to be graduated is such that the successive values of $E_x$
are independent of one another. We will also assume that the
mean value of the square of $E_x$ is the same for each term. This
is not strictly true, but the effect of the assumption will be
relatively much the same in the different formulas. It has been
already stated that a graduation is expected to reduce as far as
possible the values of $E_x$ and to convert what remains of them
into as smooth a series as possible. We will therefore investigate
the mean value of the square of the graduated value of $E_x$ and
the mean value of the square of the third difference of that value.

46. In order to arrive at the effect on the magnitude of $E_x$ it is
necessary to expand the graduation formula in terms of $E$. Any
formula may be expressed as follows:

$$G = a_0 + a_1(E + E^{-1}) + a_2(E^2 + E^{-2}) + \cdots = a_0 + a_1\gamma_1 + a_2\gamma_2 + \cdots.$$

So that we have

$$GE_x = a_0E_x + a_1(E_{x+1} + E_{x-1}) + a_2(E_{x+2} + E_{x-2}) + \cdots.$$

And, since the values of $E_x$ are independent of one another and
the mean value of each is zero, the mean value of the square
of $GE_x$ is, if we write $\mu^2$ for the mean value of the square of $E_x$,
$\mu^2(a_0^2 + 2a_1^2 + 2a_2^2 + \cdots)$. The reciprocal of the value of
$(a_0^2 + 2a_1^2 + 2a_2^2 + \cdots)$ therefore represents what may be called
the weight of the graduation formula, or its effect in reducing
individual errors. No general relation can, however, be deduced
between the summations involved and the weight of the resulting
formula, except that it may be seen that the longer the interval
over which the summations extend the greater will be the weight,
and that successive summations have a progressively decreasing
effect.
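As an illustration of the weight so defined, the sum $a_0^2 + 2a_1^2 + 2a_2^2 + \cdots$ may be computed for Woolhouse's formula from the coefficients given in Art. 38. This is a minimal sketch; the function name is an assumption made for the example.

```python
from fractions import Fraction

def error_variance_factor(coeffs):
    """a0^2 + 2(a1^2 + a2^2 + ...) for coefficients [a0, a1, a2, ...]."""
    return coeffs[0] ** 2 + 2 * sum(a * a for a in coeffs[1:])

# Coefficients of U_0, (U_1 + U_-1), ..., (U_7 + U_-7) in Woolhouse's formula.
woolhouse = [Fraction(n, 125) for n in (25, 24, 21, 7, 3, 0, -2, -3)]

factor = error_variance_factor(woolhouse)   # multiplier of mu^2
weight = 1 / factor                         # weight of the formula
print(factor, float(weight))
```

The mean square error of a graduated term is thus a little under one fifth of that of an ungraduated term, on the stated assumption of independent errors of equal mean square.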

47. In investigating the effect on the magnitude of the third
difference it is necessary to expand $\delta^3G$ in terms of $E$, since $\delta$ is
a symbol of differencing. But we have, where $H$ designates the
preliminary operation,

$$\delta^3G = \delta^3\frac{[p][q][r]}{pqr}H = \frac{1}{pqr}\{p\}\{q\}\{r\}H$$

$$= \frac{1}{pqr}(E^{p/2} - E^{-p/2})(E^{q/2} - E^{-q/2})(E^{r/2} - E^{-r/2})H$$

$$= \frac{1}{pqr}\bigl(E^{(p+q+r)/2} - E^{(p+q-r)/2} - E^{(q+r-p)/2} - E^{(r+p-q)/2} + E^{-(p+q-r)/2} + E^{-(q+r-p)/2} + E^{-(r+p-q)/2} - E^{-(p+q+r)/2}\bigr)H.$$

If $p = q = r = n$ this becomes

$$\delta^3G = \frac{1}{n^3}\{E^{3n/2} - 3E^{n/2} + 3E^{-n/2} - E^{-3n/2}\}H.$$

In this case if the range of terms included in $H$ when expanded
in terms of $E$ does not exceed $n$, or if there are no two significant
terms in $H$ separated by an interval which is a multiple of $n$,
that is, if there are no like terms to be brought together after
multiplying out, the sum of the squares of the coefficients of the
powers of $E$ in $\delta^3G$ will be exactly $20/n^6$ times the sum of the
squares of the coefficients in $H$. For example, in Woolhouse's
formula,

$$H = -3E + 7 - 3E^{-1},$$

$$\delta^3G_w = \tfrac{1}{125}(E^{15/2} - 3E^{5/2} + 3E^{-5/2} - E^{-15/2})(-3E + 7 - 3E^{-1})$$

$$= \tfrac{1}{125}\{-3E^{17/2} + 7E^{15/2} - 3E^{13/2} + 9E^{7/2} - 21E^{5/2} + 9E^{3/2} - 9E^{-3/2} + 21E^{-5/2} - 9E^{-7/2} + 3E^{-13/2} - 7E^{-15/2} + 3E^{-17/2}\}.$$

The sum of the squares of the coefficients in this expansion is
readily seen to be equal to

$$\frac{1}{125^2}(1 + 9 + 9 + 1)(3^2 + 7^2 + 3^2) = \frac{20}{125^2}(3^2 + 7^2 + 3^2).$$

As the sum of the squares of the coefficients in the expansion of
the expression for the third difference of the original series is 20,
the effect of a graduation on the smoothness of the series is usually
measured by a smoothing coefficient which is computed by dividing
the sum of the squares of the coefficients in the expansion of the
expression for the third difference of the graduated series by 20
and extracting the square root of the quotient. Thus the
smoothing coefficient in Woolhouse's graduation is $\sqrt{67}/125$, or about 1/15.
In Higham's formula the coefficient similarly is $\sqrt{5}/125$, or about
1/56. (21) (25)
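The smoothing coefficient may be computed mechanically by multiplying out the operators. In the sketch below each operator is held as a dictionary of coefficients keyed by the exponent of $E$ measured in half-units, so that $E^{5/2}$ has key 5; this representation and the function names are assumptions made for the illustration. Hardy's formula is included to show a case in which like terms have to be brought together.

```python
from fractions import Fraction
from math import sqrt

def multiply(a, b):
    """Product of two operators {exponent of E in half-units: coefficient}."""
    out = {}
    for i, ca in a.items():
        for j, cb in b.items():
            out[i + j] = out.get(i + j, 0) + ca * cb
    return out

def difference(n):
    """E^{n/2} - E^{-n/2}, which equals delta times [n]."""
    return {n: 1, -n: -1}

def smoothing_ratio(p, q, r, h):
    """Sum of squared coefficients of delta^3 G, divided by 20."""
    op = multiply(multiply(difference(p), difference(q)),
                  multiply(difference(r), h))
    total = sum(c * c for c in op.values())
    return Fraction(total, 20 * (p * q * r) ** 2)

H_WOOLHOUSE = {0: 7, 2: -3, -2: -3}                 # 7 - 3(E + 1/E)
H_HIGHAM = {0: 1, 2: 1, -2: 1, 4: -1, -4: -1}       # [3] - gamma_2

for name, args in {"Woolhouse": (5, 5, 5, H_WOOLHOUSE),
                   "Higham": (5, 5, 5, H_HIGHAM),
                   "Hardy": (4, 5, 6, H_HIGHAM)}.items():
    ratio = smoothing_ratio(*args)
    print(name, ratio, 1 / sqrt(ratio))   # last value: reciprocal of the coefficient
```

The reciprocals printed are about 15, 56 and 95 respectively.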

48. It will be seen from the general expression for the third
difference of the graduated series that in general the following
relations hold.

(a) The smoothing coefficient will decrease as the product $pqr$
increases.

(b) The smaller the sum of the squares of the coefficients in $H$,
the smaller will be the smoothing coefficient.

(c) The substitution of unequal summations for equal ones
will tend to reduce the coefficient, although this tendency will be
somewhat checked by the fact that for a formula of given range
the product $pqr$ will be smaller for unequal summations than
for equal.

It has been shown (see Transactions Actuarial Society, Vol.
XVII, p. 43) that the minimum smoothing coefficient in a summation
formula covering $2n + 1$ terms and expressed in the general
form $G = a_0 + a_1\gamma_1 + a_2\gamma_2 + \cdots + a_n\gamma_n$ is obtained when $a_x$ takes the
form $(h + jx^2)\{(n + 1)^2 - x^2\}\{(n + 2)^2 - x^2\}\{(n + 3)^2 - x^2\}$ and
$h$ and $j$ are so determined that $a_0 + 2a_1 + 2a_2 + \cdots + 2a_n = 1$ and
$a_1 + 4a_2 + \cdots + n^2a_n = 0$. This reduces to simpler form when we
put $m = n + 2$, so that the number of terms covered by the
formula is $2m - 3$. It may then be shown that the general
expression for $a_x$ is

$$a_x = \frac{315\{(m - 1)^2 - x^2\}\{m^2 - x^2\}\{(m + 1)^2 - x^2\}\{(3m^2 - 16) - 11x^2\}}{8m(m^2 - 1)(4m^2 - 1)(4m^2 - 9)(4m^2 - 25)}.$$
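The general expression for $a_x$ can be evaluated exactly and the two conditions imposed on the coefficients checked for particular values of $m$. The sketch below is illustrative only, and the function name is an assumption.

```python
from fractions import Fraction

def minimum_smoothing_coefficients(m):
    """a_0, ..., a_{m-2} of the (2m - 3)-term formula, from the general expression."""
    denom = (8 * m * (m * m - 1) * (4 * m * m - 1)
             * (4 * m * m - 9) * (4 * m * m - 25))
    coeffs = []
    for x in range(0, m - 1):          # x = 0, 1, ..., n where n = m - 2
        num = (315 * ((m - 1) ** 2 - x * x) * (m * m - x * x)
               * ((m + 1) ** 2 - x * x) * ((3 * m * m - 16) - 11 * x * x))
        coeffs.append(Fraction(num, denom))
    return coeffs

for m in (3, 4, 5):
    a = minimum_smoothing_coefficients(m)
    total = a[0] + 2 * sum(a[1:])                               # should be 1
    second = sum(x * x * ax for x, ax in enumerate(a))          # should be 0
    print(2 * m - 3, [str(c) for c in a], total, second)
```

For $m = 3$ the formula covers three terms only and necessarily reduces to the identity, $a_0 = 1$ and $a_1 = 0$.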


49. In accordance with the above principles it is found that
Hardy's modification of Higham's formula reduced the smoothing
coefficient to 1/95 by substituting unequal for equal summations.
The smoothing coefficient in Karup's 19-term formula is 1/106,
due to the small coefficients in $H$, and the coefficient in Spencer's
21-term formula is 1/160, due to all three causes. If still greater
smoothing power is required than is obtained in Spencer's formula,
we may extend the number of terms to 27 and use the following:

$$G = \frac{[5][7][11]}{385}\{[3] - \gamma_3\}.$$

This formula is due to Mr. Kenchington. It is easily applied,
is correct to third differences and its smoothing coefficient is 1/326.
(24)

50. The way in which these theoretical results are verified in
actual experience may be tested by differencing the section of a
table which we have graduated by Woolhouse's, Higham's and
Spencer's formulas. We find that the sum of the absolute values,
irrespective of sign, of the eight third differences of $10^5q_x$ is for
Woolhouse's formula 149, for Higham's 43 and for Spencer's 19.
It will be noticed that this sum is not reduced by the more
powerful formulas in the full proportion called for by the theory. This
may be partly an accidental variation due to the small number of
terms, but there is a natural explanation from the fact that there
is a residual irregularity due to the dropping of fractions, which no
formula would eliminate. The mean value of the third difference
error arising from this source is in fact approximately unity, so
that the probable value of the sum, irrespective of sign, of eight
such differences would be approximately 8.

51. Let us return now to the effect of the graduation on $V_x$, the
regular part of the series, and determine, for the different formulas
we have mentioned, the error introduced in a fifth difference curve.

For Woolhouse's formula we have

$$G = \frac{[5]^3}{125}\{10 - 3[3]\} = \left(1 + D^2 + \tfrac{17}{60}D^4\right)^3\left(1 - 3D^2 - \tfrac{1}{4}D^4\right)$$

$$= \left(1 + 3D^2 + \tfrac{77}{20}D^4\right)\left(1 - 3D^2 - \tfrac{1}{4}D^4\right) = 1 - 5.4D^4.$$

For Higham's formula

$$G = \left(1 + 3D^2 + \tfrac{77}{20}D^4\right)\left(1 - 3D^2 - \tfrac{5}{4}D^4\right) = 1 - 6.4D^4.$$

For Hardy's formula

$$G = \left(1 + \tfrac{5}{8}D^2 + \tfrac{41}{384}D^4\right)\left(1 + D^2 + \tfrac{17}{60}D^4\right)\left(1 + \tfrac{35}{24}D^2 + \tfrac{707}{1152}D^4\right)\left(1 - 3D^2 - \tfrac{5}{4}D^4\right)$$

$$= 1 + \tfrac{1}{12}D^2 - 6.5D^4.$$

For Karup's formula

$$G = \left(1 + 3D^2 + \tfrac{77}{20}D^4\right)\left(1 - 3D^2 - \tfrac{53}{20}D^4\right) = 1 - 7.8D^4.$$

For Spencer's 21-term formula

$$G = \left(1 + D^2 + \tfrac{17}{60}D^4\right)^2\left(1 + 2D^2 + \tfrac{7}{6}D^4\right)\left(1 - 4D^2 - 3\tfrac{1}{3}D^4\right)$$

$$= \left(1 + 4D^2 + 6\tfrac{11}{15}D^4\right)\left(1 - 4D^2 - 3\tfrac{1}{3}D^4\right) = 1 - 12.6D^4.$$

For Kenchington's 27-term formula

$$G = \left(1 + D^2 + \tfrac{17}{60}D^4\right)\left(1 + 2D^2 + \tfrac{7}{6}D^4\right)\left(1 + 5D^2 + 7\tfrac{5}{12}D^4\right)\left(1 - 8D^2 - 6\tfrac{2}{3}D^4\right)$$

$$= \left(1 + 8D^2 + 25\tfrac{13}{15}D^4\right)\left(1 - 8D^2 - 6\tfrac{2}{3}D^4\right) = 1 - 44.8D^4.$$

We thus see that in general with an increase in the smoothing
power of a formula there is combined an increase in the error, and
in particular that in going from Spencer's 21-term formula to the
27-term formula we obtain a reduction in the smoothing coefficient
from 1/160 to 1/326 at the expense of an increase in the error
from $12.6D^4$ to $44.8D^4$. With so large a coefficient as 44.8 the
error becomes appreciable in the case of some functions to which
the formula might be applied. (20) (21) (22) (23)
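The expansions of this article can be reproduced with exact fractions by carrying each operator as a series in $D^2$ truncated after $D^4$. The following is a sketch under that assumption; the helper names, and the representation of a series as the triple of coefficients of $1$, $D^2$ and $D^4$, are introduced only for the illustration.

```python
from fractions import Fraction as F

def mul(a, b):
    """Product of two even series (c0, c2, c4) in D, truncated after D^4."""
    return (a[0] * b[0],
            a[0] * b[1] + a[1] * b[0],
            a[0] * b[2] + a[1] * b[1] + a[2] * b[0])

def combine(*terms):
    """Linear combination; each term is (multiplier, series)."""
    return tuple(sum(k * s[i] for k, s in terms) for i in range(3))

def summ(n):
    """[n] = n + n(n^2-1)/24 D^2 + n{5(n^2-1)^2 - 2(n^4-1)}/5760 D^4."""
    return (F(n), F(n * (n * n - 1), 24),
            F(n * (5 * (n * n - 1) ** 2 - 2 * (n ** 4 - 1)), 5760))

def gamma(n):
    """gamma_n = 2 + n^2 D^2 + (n^4/12) D^4."""
    return (F(2), F(n * n), F(n ** 4, 12))

ONE = (F(1), F(0), F(0))

def graduation(sums, divisor, adjustment):
    """(product of [n] for n in sums) / divisor, times the adjustment series."""
    g = (F(1, divisor), F(0), F(0))
    for n in sums:
        g = mul(g, summ(n))
    return mul(g, adjustment)

formulas = {
    "Woolhouse":   graduation((5, 5, 5), 125, combine((10, ONE), (-3, summ(3)))),
    "Higham":      graduation((5, 5, 5), 125, combine((1, summ(3)), (-1, gamma(2)))),
    "Hardy":       graduation((4, 5, 6), 120, combine((1, summ(3)), (-1, gamma(2)))),
    "Karup":       graduation((5, 5, 5), 625, combine((3, summ(3)), (-2, gamma(3)))),
    "Spencer":     graduation((5, 5, 7), 350,
                              combine((1, ONE), (1, summ(3)), (-1, gamma(3)))),
    "Kenchington": graduation((5, 7, 11), 385, combine((1, summ(3)), (-1, gamma(3)))),
}

for name, (c0, c2, c4) in formulas.items():
    print(name, c0, c2, float(c4))
```

Each formula has unit constant term; only Hardy's retains a coefficient of $D^2$.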

52. As in the case of graphic graduation we may largely reduce
this error, either by using a regular series calculated by a
mathematical formula as a basis and graduating either the difference or
the ratio of the two series, or by a transformation similar to that
described in Arts. 17-19.

53. A difficulty met with in graduating a series by a summation
formula arises from the fact that, in order to determine any given
graduated value, it is necessary to have ungraduated values over
a considerable interval on both sides of the required value. There
is therefore an interval at each end of the series for which the
graduated values cannot be obtained by the formula. There is an
apparent exception to this in the case of a frequency distribution
in which the terms gradually diminish toward the ends and finally
vanish, but it must be remembered in such a case that the series is
really infinitely extended, the ungraduated values beyond the
limits being zero. This case also arises in applying such a formula
to the $l_x$ or $d_x$ column of a mortality table, since the values beyond
the limiting age are all zero. In other cases, however, such as
when the values of $q_x$ or $m_x$ are being graduated, it is necessary
to adopt some device to complete the table. This is more
especially necessary at the older ages, but is sometimes necessary at the
young ages also. Various devices have been adopted to accomplish
this object. One consists in making a hypothetical extension of
the table to the extent necessary to permit of its completion. In
the case of the $q_x$ or $m_x$ column at the older ages this extension
may consist of a series of numbers increasing in geometrical
progression with a common ratio approaching 1.1. At the younger
ages a large enough group of ages is sometimes taken to obtain a
fair number of deaths and the series extended by assuming this
rate to be constant for all younger ages. Another device is to
take the last three graduated values given by the formula as a
basis and to extend the table as a third difference series through
these three values, with a third difference so determined that the
expected deaths for the ages in question will equal the actual.
At the younger ages the series sometimes becomes very irregular
on account of insufficient data before it absolutely disappears,
and in that case it may be advisable to ignore even some of the
graduated values that can theoretically be obtained and apply the
above method over the larger interval.

This difficulty also arises in any endeavor to graduate by this
method select or analyzed mortality tables, accentuated in this
case by the rapid changes in the values of $q_{[x]+t}$ for changes in $t$
where $t$ is small. A device which has been used with some success
for this purpose is to make a preliminary adjustment of the
first-year and ultimate mortality. Then assume

$$q_{[x-t]+t} = q_x - f(t)(q_x - q_{[x]}),$$

where $q_x$ is the rate of mortality for attained age $x$ in the ultimate
table and $q_{[x]}$ is the first-year rate for the same age. Determine
now values of $f(t)$ which will reproduce the aggregate deaths of
the second, third, fourth and fifth years and substitute a smooth
series of values, remembering that $f(0) = 1$ and that $f(t)$ vanishes
when the ultimate stage is reached. Then for each attained age
determine modified ungraduated values of $q_x$ and $q_{[x]}$ from the
equations

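$$\sum_{t=0}^{\infty} E_{[x-t]+t}\,q_{[x-t]+t} = q_x\sum_{t=0}^{\infty} E_{[x-t]+t} - (q_x - q_{[x]})\sum_{t=0}^{\infty} f(t)E_{[x-t]+t} = \sum_{t=0}^{\infty}\theta_{[x-t]+t},$$

$$\sum_{t=0}^{\infty} f(t)E_{[x-t]+t}\,q_{[x-t]+t} = q_x\sum_{t=0}^{\infty} f(t)E_{[x-t]+t} - (q_x - q_{[x]})\sum_{t=0}^{\infty}\{f(t)\}^2E_{[x-t]+t} = \sum_{t=0}^{\infty} f(t)\theta_{[x-t]+t}.$$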

The values of $q_x$ and $q_{[x]}$ are then each graduated by a summation
formula and the values of $q_{[x]+1}$, $q_{[x]+2}$, etc., filled in from the
equation

$$q_{[x-t]+t} = q_x - f(t)(q_x - q_{[x]}).$$

The values of $q_x$ should first be graduated, and then for the purpose
of extending $q_{[x]}$ at the older ages as a preliminary to graduation
an average percentage of the ultimate rates based on a fairly broad
group may be used.

54. As the Mortality Table given on page 5 is very irregular,
great smoothing power is desirable in any formula applied to it.
Let us therefore use Mr. Kenchington's 27-term formula given
in Art. 49. We find by grouping the experience up to age 63
inclusive that the actual deaths are 67.3 per cent of the expected by
the $O^{M(5)}$ table. We accordingly insert 67.3 per cent of the
$O^{M(5)}$ rates of mortality at all ages from 63 down to 42 inclusive
in order to obtain graduated values down to age 55. The following
table shows the process of graduating the values of $q_x$.


Graduation of Data by Summation

  x     (1)     (2)     (3)     (4)     (5)      (6)      (7)   (7) ÷ (385 × 10^3)
    10^3 q_x  [3](1)  γ_3(1) (2)-(3) [11](4)   [7](5)   [5](6)      = q'_x

 42       7
 43       7
 44       8
 45       8      24      16       8
 46       8      25      17       8
 47       9      26      18       8
 48       9      28      19       9
 49      10      29      20       9
 50      10      31      21      10     111
 51      11      33      22      11     118
 52      12      35      24      11     126
 53      12      37      25      12     134      928
 54      13      39      27      12     142      992
 55      14      42      29      13     151    1,096    5,525    .0144
 56      15      45      30      15     146    1,209    5,903    .0153
 57      16      48      32      16     175    1,300    6,185    .0161
 58      17      51      35      16     222    1,306    6,369    .0165
 59      18      54      37      17     239    1,274    6,484    .0168
 60      19      58      40      18     225    1,280    6,598    .0171
 61      21      62      57       5     148    1,324    6,882    .0179
 62      22      67      27      40     119    1,414    7,429    .0193
 63      24      86      28      58     152    1,590    8,221    .0214
 64      40      73      44      29     219    1,821    9,269    .0241
 65       9      58      60      -2     312    2,072   10,588    .0275
 66       9      41     105     -64     415    2,372   12,180    .0316
 67      23      70      84     -14     456    2,733   14,110    .0366
 68      38     142      93      49     399    3,182   16,272    .0423
 69      81     163      80      83     419    3,751   18,526    .0481
 70      44     209      99     110     513    4,234   20,907    .0543
 71      84     199      78     121     668    4,626   23,324    .0606
 72      71     231     185      46     881    5,114   25,639    .0666
 73      76     187     204     -17     898    5,599   27,953    .0726
 74      40     220     142      78     848    6,066   30,463    .0791
 75     104     304     181     123     887    6,548   33,338    .0866
 76     160     322     169     153     904    7,136   36,764    .0955
 77      58     328     179     149     980    7,989   40,834     .106
 78     110     261     258       3   1,150    9,025   45,658     .119
 79      93     342     343      -1   1,469   10,136   51,315     .133
 80     139     386     264     122   1,751   11,372   57,574     .150
 81     154     476     349     127   1,884   12,793   64,135     .167
 82     183     543     346     197   1,998   14,248   70,724     .184
 83     206     628     412     216   2,140   15,586   76,828     .200
 84     239     698     396     302   2,401   16,725   82,157     .213
 85     253     765     405     360   2,605   17,476   86,227     .224
 86     273     768     512     256   2,807   18,122   88,216     .229
 87     242     737     470     267   2,890   18,318
 88     222     770     479     291   2,635   17,575
 89     306     759     495     264   2,644
 90     231     763     560     203   2,336
 91     226     679     355     324   1,658
 92     222     766     556     210
 93     318     673     731     -58
 94     133     701     476     225
 95     250     883     889      -6
 96     500   1,000   1,318    -318
 97     250   1,417
 98     667   1,917
 99   1,000
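The column operations of the table can be traced numerically. The sketch below is only an illustration: it reads the column headings as the operand $[3] - \gamma_3$ (the sum of three consecutive terms less the two terms three places away on either side) followed by summations in 11, 7 and 5 and division by 385. Kenchington's formula itself is stated in Art. 49, so this reading is an assumption drawn from the headings. The function names are hypothetical, and the input is column (1) for ages 42 to 68 as printed.

```python
def running_sum(values, width):
    """Sum `width` consecutive terms; the result is shorter by width - 1 terms."""
    return [sum(values[i:i + width]) for i in range(len(values) - width + 1)]


def graduate(q):
    """Apply ([3] - gamma_3), then [11], [7] and [5]; return column (7) and (7)/385."""
    # Column (4): (u[x-1] + u[x] + u[x+1]) - (u[x-3] + u[x+3])
    col4 = [q[i - 1] + q[i] + q[i + 1] - q[i - 3] - q[i + 3]
            for i in range(3, len(q) - 3)]
    col5 = running_sum(col4, 11)   # column (5)
    col6 = running_sum(col5, 7)    # column (6)
    col7 = running_sum(col6, 5)    # column (7)
    return col7, [v / 385 for v in col7]


# Column (1), 10^3 q_x, for ages 42 to 68 as printed in the table.
q_42_to_68 = [7, 7, 8, 8, 8, 9, 9, 10, 10, 11, 12, 12, 13, 14, 15, 16, 17,
              18, 19, 21, 22, 24, 40, 9, 9, 23, 38]

col7, graduated = graduate(q_42_to_68)
q55 = round(graduated[0] / 1000, 4)   # the 27 terms yield one value, for age 55
print(col7, q55)
```

Twenty-seven consecutive rates yield exactly one graduated value, which is why the graduated column of the table begins thirteen ages after the first entry of column (1).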
Comparison of Actual and Expected Deaths

  x     q'_x    Expected   Actual   Deviation   Accumulated
                 Deaths    Deaths               Deviation

 55    .0144       .0         0        .0           .0
 56    .0153       .1         0       +.1          +.1
 57    .0161       .2         0       +.2          +.3
 58    .0165       .3         0       +.3          +.6
 59    .0168       .5         1       -.5          +.1
 60    .0171       .8         1       -.2          -.1
 61    .0179      1.0         3      -2.0         -2.1
 62    .0193      1.4         2       -.6         -2.7
 63    .0214      1.8         0      +1.8          -.9
 64    .0241      2.4         4      -1.6         -2.5
 65    .0275      2.9         1      +1.9          -.6
 66    .0316      3.8         1      +2.6         +2.0
 67    .0366      4.7         3      +1.7         +3.7
 68    .0423      5.6         5       +.6         +4.3
 69    .0481      6.5        11      -4.5          -.2
 70    .0543      7.3         6      +1.3         +1.1
 71    .0606      8.7        12      -3.3         -2.2
 72    .0666      9.3        10       -.7         -2.9
 73    .0726     10.5        11       -.5         -3.4
 74    .0791     11.8         6      +5.8         +2.4
 75    .0866     13.3        16      -2.7          -.3
 76    .0955     14.3        24      -9.7        -10.0
 77    .106      14.7         8      +6.7         -3.3
 78    .119      17.3        16      +1.3         -2.0
 79    .133      18.6        13      +5.6         +3.6
 80    .150      20.6        19      +1.6         +5.2
 81    .167      22.7        21      +1.7         +6.9
 82    .184      23.2        23       +.2         +7.1
 83    .200      25.2        26       -.8         +6.3
 84    .213      23.2        26      -2.8         +3.5
 85    .224      20.4        23      -2.6          +.9
 86    .233      17.9        21      -3.1         -2.2
 87    .242      16.0        16       0.0         -2.2
 88    .249      13.4        12      +1.4          -.8
 89    .256      12.5        15      -2.5         -3.3
 90    .263      10.3         9      +1.3         -2.0
 91    .270       8.4         7      +1.4          -.6
 92    .278       7.5         6      +1.5          +.9
 93    .287       6.3         7       -.7          +.2
 94    .298       4.5         2      +2.5         +2.7
 95    .311       3.7         3       +.7         +3.4
 96    .326       2.6         4      -1.4         +2.0
 97    .344       1.4         1       +.4         +2.4
 98    .365       1.1         2       -.9         +1.5
 99    .389        .4         1       -.6          +.9

Total           398.9       398     +42.6        +62.1
                                    -41.7        -44.3
                                      +.9
55. This process gives graduated values of $q_x$ up to age 86 inclusive only and the value for age 86 is considerably affected by the scanty and irregular data at the extreme old ages. We therefore ignore this value and taking the values of $q_x$ for ages 83, 84 and 85 as a basis assume that the values for older ages form a third difference series, of which the general term is

$$q_{83+n} = .200 + .013\,n - .002\,\frac{n(n-1)}{2} + y\,\frac{n(n-1)(n-2)}{6}.$$

The value of $y$ is then determined so as to make the expected deaths for ages 86 and over equal to the actual. The required value of $y$ is found to be .0003952 and the rates of mortality from age 86 to age 99 are inserted on this basis. The table on page 41 shows the comparison of the actual with the expected deaths.
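The extrapolated rates can be reproduced from the third difference series. The short sketch below assumes the leading terms .200, .013 and -.002 implied by the rates at ages 83, 84 and 85, together with the stated value of $y$; the function name is hypothetical.

```python
def q_extrapolated(n, y=0.0003952):
    """Rate of mortality at age 83 + n from the assumed third difference series."""
    return (0.200
            + 0.013 * n
            - 0.002 * n * (n - 1) / 2
            + y * n * (n - 1) * (n - 2) / 6)


# Ages 83 to 99, rounded to three decimals as in the comparison table.
rates = {83 + n: round(q_extrapolated(n), 3) for n in range(17)}
for age, rate in rates.items():
    print(age, rate)
```

For $n = 0, 1, 2$ the term in $y$ vanishes, so the series passes through the graduated values at ages 83, 84 and 85 whatever value of $y$ is chosen.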

Graduation  by  Mathematical  Formula. 

(a)  Frequency  Distributions. 

56. If a variable may have any value within certain limits, the curve in which the ordinate $y$ is proportional to the chance of the variable having the value $x$ represented by the abscissa, is called the curve of frequency. For example, let the variable be the length of a human life which may have any value from zero to the limits of the mortality table. The chance of the duration falling between $x$ and $x + dx$ is evidently $(l_x - l_{x+dx})/l_0$ or $l_x\mu_x\,dx/l_0$, so that the equation of the frequency-curve in this case is $y = l_x\mu_x$.

Similarly,  if  two  points  be  taken  at  random  on  a straight  line; 
the  curve  of  frequency  of  the  distance  of  the  furthest  of  the  two 
from  a specified  end  is  represented  by  the  straight  line  y = x 
between  the  limits  0 and  a where  a is  the  length  of  the  line. 

57.  Where  in  any  two  cases  the  variables  are  comparable  quanti- 
ties, such  as  the  heights  of  individuals  in  two  different  nations  or 
the  durations  of  life  in  two  different  groups  of  individuals,  it  is 
convenient  to  have  some  short  method  of  comparison  between  the 
two;  some  coefficient  or  factor  which  will  indicate  whether  one 
curve  or  the  other  falls,  on  the  whole,  on  the  higher  values  of  the 
variable,  and  if  so  by  how  much. 

The quantity most frequently used for this purpose is the mean value of the variable obtained by multiplying each value by its probability and summing. In other words, if $m_1$ be the mean value and $a$ and $b$ the limits of the curve $y = U_x$ we have

$$m_1 = \frac{\int_a^b x\,U_x\,dx}{\int_a^b U_x\,dx}.$$

It is evident from the above method that if in one curve the value of $m_1$ is greater than in the other, the former falls on the average on higher values of the variable. It is also evident that represented geometrically, $m_1$ is the abscissa of the center of position of the area included between the curve and the base.

Other functions which have been used for this purpose are the median and the mode. The median is that value of the variable which it is as likely as not to exceed. Its value $h$ is determined by the equation

$$\int_a^h y\,dx = \int_h^b y\,dx,$$

where $a$ is the lower and $b$ the upper limit of the value of $x$. The mode is the value of the variable whose probability is the greatest. For instance: in the case where the curve is represented by the equation $y = f(t) = l_{x+t}\mu_{x+t}$, $t$ being the variable, we see that the mean value of $t$ is the complete expectation of life at age $x$, that the median value is what is known as the vie probable or equation of life, and that the mode corresponds to the most probable after-lifetime.

58. When the mean value of the variable has been determined, the next question is, “How closely do the values of the variable cluster about this mean value or how widely dispersed are they?” Several methods might be proposed of measuring the degree of dispersion, but the most natural one in connection with the mean is the mean square of the departure designated by $\mu_2$ when the departure is measured from the mean. (1) (27) (28) (30) (31)

The mean square of the departure from any given value is known as the second moment about that value. The value of the second moment $m_2$ about any given origin can be readily expressed in terms of $m_1$ the mean value of the variable, and $\mu_2$ as follows:

Designating the operation of taking the mean value by writing $M$ in front of the expression, we have

$$\mu_2 = M(x - m_1)^2 = Mx^2 - 2m_1Mx + m_1^2 = m_2 - 2m_1^2 + m_1^2$$

or

$$m_2 = \mu_2 + m_1^2.$$

It  is  thus  evident  that  the  second  moment  about  any  other  value 
of  the  variable  is  greater  than  that  about  the  mean  value.  In 
other  words,  taking  the  mean  value  as  point  of  reference  makes  the 
second  moment  a minimum.  It  thus  appears  that  the  second 
moment  has  a natural  connection  with  the  mean  value  of  the 
variable.  The  mean  absolute  departure,  departure  in  either  direc- 
tion being  considered  positive  has  a similar  connection  with  the 
median. 

Other measures in terms of $\mu_2$ are sometimes substituted for it in order to express the dispersion as a linear magnitude. The measures most frequently so used in connection with frequency-curves in general are the standard deviation and the modulus. The standard deviation, commonly denoted by $\sigma$, is a quantity whose square is equal to the mean square of departure. In other words $\sigma^2 = \mu_2$. The modulus, sometimes denoted by $c$, is determined by the equation $c^2 = 2\mu_2$ and the name is derived by analogy from the normal exponential frequency-curve whose equation is $y = ke^{-x^2/c^2}$, in the case of which curve $\mu_2 = c^2/2$.

59.  Having  determined  the  mean  value  of  the  variable  and  the 
degree  of  dispersion  from  that  mean  value,  the  next  question  is 
whether  the  various  possible  values  of  the  variable  are  dispersed 
symmetrically  about  the  mean  value  or  whether  the  curve  is 
heaped  up  on  one  side  and  drawn  out  on  the  other.  And  as  the 
first  moment  necessarily  vanishes  and  the  second  moment  can 
give  us  no  information  on  the  subject  because  departures  in  both 
directions  enter  into  it  positively,  we  are  forced  to  look  to  the 
third  moment,  or  the  mean  value  of  the  cube  of  the  departure, 
for  a criterion.  It  is  evident  that  if  the  curve  is  symmetrical,  each 
positive  departure  will  be  balanced  by  a corresponding  negative 
one,  and  so  the  mean  value  of  the  cube  will  vanish.  It  is  thus 
evident  that  a value  of  the  third  moment,  other  than  zero,  is  an 
indication of a lack of symmetry. Denoting by $\mu_3$ the third moment about the mean and by $m_3$ the corresponding moment about any other point taken as origin, we have

$$\mu_3 = M(x - m_1)^3 = Mx^3 - 3m_1Mx^2 + 3m_1^2Mx - m_1^3$$
$$= m_3 - 3m_1m_2 + 2m_1^3 = m_3 - 3m_1\mu_2 - m_1^3$$

or

$$m_3 = \mu_3 + 3m_1\mu_2 + m_1^3.$$

Of course, the curve is not necessarily absolutely symmetrical if $\mu_3$ vanishes, but any marked lack of symmetry would likely show itself in the value of $\mu_3$. The value of $\mu_3$ is usually taken as a measure of the skewness or lack of symmetry, being divided by $c^3$ in order that the measure may be always numerical, and the quotient being designated by $j$ so that we have $j = \mu_3/c^3$. Another function entering into the theory of curves of frequency is the quotient of the square of the third moment by the cube of the second moment and denoted by $\beta_1$ so that we have $\beta_1 = \mu_3^2/\mu_2^3$.

But we have $j^2 = \mu_3^2/c^6 = \mu_3^2/8\mu_2^3$, so that $\beta_1 = 8j^2$.

60. Similarly, further information in relation to the curve can be secured by determining the moments of higher order, but we shall in this investigation only take into account the fourth moment $\mu_4$ or the mean value of the fourth power of the departure from the mean, and the quantity $\beta_2$ determined from the equation $\beta_2 = \mu_4/\mu_2^2$.

The value of $m_4$, the fourth moment about any other point as origin, may be readily seen to be connected with that of $\mu_4$ by the following relation

$$\mu_4 = m_4 - 4m_1m_3 + 6m_1^2m_2 - 3m_1^4$$

or

$$m_4 = \mu_4 + 4m_1\mu_3 + 6m_1^2\mu_2 + m_1^4.$$
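The relations between moments about an arbitrary origin and moments about the mean can be checked numerically. The sketch below uses a small hypothetical frequency table (not data from this study) and hypothetical function names; it computes the moments about the origin, converts them by the formulas of Arts. 58 to 60, and compares the result with moments taken directly about the mean.

```python
def raw_moments(values, freqs, origin=0.0, order=4):
    """Mean values of (x - origin)^n for n = 0..order, weighted by frequency."""
    total = sum(freqs)
    return [sum(f * (x - origin) ** n for x, f in zip(values, freqs)) / total
            for n in range(order + 1)]


def central_from_raw(m):
    """mu_2, mu_3, mu_4 from m_1..m_4 taken about any origin."""
    m1, m2, m3, m4 = m[1], m[2], m[3], m[4]
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
    return mu2, mu3, mu4


# Hypothetical skew distribution, for illustration only.
values = [0, 1, 2, 3, 4]
freqs = [5, 9, 7, 3, 1]

m = raw_moments(values, freqs)                  # about the origin x = 0
mu2, mu3, mu4 = central_from_raw(m)
direct = raw_moments(values, freqs, origin=m[1])  # about the mean itself

beta1 = mu3 ** 2 / mu2 ** 3
beta2 = mu4 / mu2 ** 2
print(m[1], mu2, mu3, mu4, beta1, beta2)
```

A positive $\mu_3$ here reflects the longer tail on the side of the higher values.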

61.  When  it  is  desired  to  graduate  a frequency  distribution  by 
means  of  a mathematical  formula,  the  constants  in  the  formula 
are  ordinarily  so  determined  as  to  make  the  area  of  the  frequency 
curve  and  the  moments,  or  mean  values  of  the  various  powers  of 
the  variable,  so  far  as  possible  the  same  as  in  the  ungraduated 
distribution.  It  is  therefore  necessary  to  determine  the  moments 
in  the  ungraduated  series  and  also  to  determine  the  relation 
between  the  moments  and  the  arbitrary  constants  in  the  sub- 
stituted series. 

62. In calculating the moments in an ungraduated series, we generally meet with the difficulty that the exact values of the various items are not given, but we have the total number of cases falling within certain limits. It is customary in such cases to calculate the moments on the assumption that the cases falling within each interval are concentrated at the middle of that interval and to make a correction to allow for the actual distribution. This correction for the case where the curve of frequency has close contact with the base at both limits may be derived as follows:

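Let $U_x$ designate the value of the ordinate of the curve of frequency corresponding to the value $x$ of the abscissa and let $V_x$ represent the area intercepted between the ordinates corresponding to the values $x - \tfrac12$ and $x + \tfrac12$, so that $V_x$ is equal to $\int_{-1/2}^{+1/2} U_{x+h}\,dh$. Then from Taylor's series for $U_{x+h}$ we have

$$V_x = U_x + \frac{1}{24}\frac{d^2U_x}{dx^2} + \frac{1}{1920}\frac{d^4U_x}{dx^4} + \cdots,$$

whence

$$\Delta^2V_{x-1} = \frac{d^2V_x}{dx^2} + \frac{1}{12}\frac{d^4V_x}{dx^4} + \cdots = \frac{d^2U_x}{dx^2} + \frac{1}{8}\frac{d^4U_x}{dx^4} + \cdots$$

and

$$\Delta^4V_{x-2} = \frac{d^4V_x}{dx^4} + \cdots = \frac{d^4U_x}{dx^4} + \cdots,$$

whence

$$U_x = V_x - \tfrac{1}{24}\Delta^2V_{x-1} + \tfrac{3}{640}\Delta^4V_{x-2} - \cdots.$$

Now $m_n = \int x^nU_x\,dx = \sum x^nU_x$, if $x^nU_x$ and its derived functions vanish at the limits of integration, whence

$$m_n = \sum x^n\left(V_x - \tfrac{1}{24}\Delta^2V_{x-1} + \tfrac{3}{640}\Delta^4V_{x-2}\right).$$

But generally

$$\sum_x x^n\Delta^{2m}V_{x-m} = \sum_x\sum_y(-1)^{m+x-y}\frac{(2m)!}{(m+x-y)!\,(m+y-x)!}\,x^nV_y = \sum_y V_y\,\Delta^{2m}(y-m)^n.$$

Therefore

$$m_n = \sum V_x\left\{x^n - \tfrac{1}{24}\Delta^2(x-1)^n + \tfrac{3}{640}\Delta^4(x-2)^n\right\}.$$

But

$$\Delta^2(x-1)^n = n(n-1)x^{n-2} + \frac{n(n-1)(n-2)(n-3)}{12}x^{n-4} + \cdots$$

and

$$\Delta^4(x-2)^n = n(n-1)(n-2)(n-3)x^{n-4} + \cdots.$$

Therefore

$$m_n = \sum V_x\left\{x^n - \frac{n(n-1)}{24}x^{n-2} + \frac{7n(n-1)(n-2)(n-3)}{5760}x^{n-4} - \cdots\right\}$$
$$= m_n' - \frac{n(n-1)}{24}m_{n-2}' + \frac{7n(n-1)(n-2)(n-3)}{5760}m_{n-4}' - \cdots,$$

where uncorrected moments are designated by accented letters. Putting $n$ successively equal to 1, 2, 3 and 4 we have

$$m_1 = m_1',$$
$$m_2 = m_2' - \tfrac{1}{12},$$
$$m_3 = m_3' - \tfrac14m_1',$$
$$m_4 = m_4' - \tfrac12m_2' + \tfrac{7}{240},$$

or if the mean value is taken as origin

$$\mu_2 = \mu_2' - \tfrac{1}{12},$$
$$\mu_3 = \mu_3',$$
$$\mu_4 = \mu_4' - \tfrac12\mu_2' + \tfrac{7}{240} = \mu_4' - \tfrac12\mu_2 - \tfrac{1}{80}.$$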
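The adjustments for grouping can be tried on a case where the true moments are known. The sketch below is an illustration under stated assumptions: the data are hypothetical (a normal curve with $\sigma = 2$ grouped in unit intervals, for which $\mu_2 = 4$ and $\mu_4 = 48$), the interval is taken as the unit, and the function names are not from the text.

```python
import math


def sheppard_corrected(mu2_raw, mu3_raw, mu4_raw):
    """Adjust moments about the mean computed from unit-width group totals."""
    mu2 = mu2_raw - 1 / 12
    mu3 = mu3_raw
    mu4 = mu4_raw - mu2_raw / 2 + 7 / 240
    return mu2, mu3, mu4


def central_moments(groups):
    """Moments 2..4 about the mean, each group concentrated at its midpoint x."""
    total = sum(groups.values())
    mean = sum(x * v for x, v in groups.items()) / total
    return tuple(sum((x - mean) ** n * v for x, v in groups.items()) / total
                 for n in (2, 3, 4))


sigma = 2.0  # hypothetical standard deviation of the underlying curve


def cdf(t):
    """Area of the normal curve below t."""
    return 0.5 * (1 + math.erf(t / (sigma * math.sqrt(2))))


# V_x: area between x - 1/2 and x + 1/2 for each integer x.
groups = {x: cdf(x + 0.5) - cdf(x - 0.5) for x in range(-25, 26)}

raw = central_moments(groups)
corrected = sheppard_corrected(*raw)
print(raw)
print(corrected)
```

The uncorrected second moment exceeds the true value by very nearly one twelfth, and the corrected values agree closely with 4 and 48, as the curve has close contact with the base at both limits.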

63. This investigation shows that we may either calculate the moments from the totals of the groups and then apply the adjustment afterwards, or we may calculate the true central ordinates by the formula

$$U_x = V_x - \tfrac{1}{24}\Delta^2V_{x-1} + \tfrac{3}{640}\Delta^4V_{x-2} - \text{etc.},$$

and then, from these corrected ordinates, calculate the moments, which will then not require any further adjustment.

The  latter  course  has  an  advantage,  where  a series  determined 
by  a mathematical  formula  is  being  fitted  to  the  observed  series, 
in  that  ordinates  are  more  easily  calculated  from  the  formula  than 
areas,  so  that  the  calculation  of  the  true  central  ordinates  in  the 
observed  series  facilitates  a comparison  of  the  two.  They  also 
give,  sometimes,  a better  idea  of  the  nature  of  the  law  of  the  series 
than  do  the  areas. 

The moments may be computed directly by selecting any value of $x$ as origin and calculating the successive values of $x^nU_x$ or $x^nV_x$ and summing. Where a series of moments is to be computed, however, the work may be abbreviated by a summation process as follows:

Suppose that $\omega$ is the highest value of $x$ which occurs and let

$${}^{\prime}U_h = \sum_{x=h}^{\omega}U_x,$$

$${}^{\prime\prime}U_h = \sum_{x=h}^{\omega}{}^{\prime}U_x = \sum_{x=h}^{\omega}\sum_{y=x}^{\omega}U_y = \sum_{x=h}^{\omega}(x-h+1)U_x,$$

$${}^{\prime\prime\prime}U_h = \sum_{x=h}^{\omega}{}^{\prime\prime}U_x = \sum_{x=h}^{\omega}\frac{(x-h+1)(x-h+2)}{2}U_x,$$

$${}^{\mathrm{iv}}U_h = \sum_{x=h}^{\omega}{}^{\prime\prime\prime}U_x = \sum_{x=h}^{\omega}\frac{(x-h+1)(x-h+2)(x-h+3)}{6}U_x,$$

$${}^{\mathrm{v}}U_h = \sum_{x=h}^{\omega}{}^{\mathrm{iv}}U_x = \sum_{x=h}^{\omega}\frac{(x-h+1)(x-h+2)(x-h+3)(x-h+4)}{24}U_x.$$

The transformation is effected in each case by expanding and then collecting terms involving the same $U_x$. Whence

$${}^{\prime}U_1 = \sum U_x,$$

$${}^{\prime\prime}U_1 = \sum xU_x = m_1,$$

$$\tfrac12({}^{\prime\prime\prime}U_1 + {}^{\prime\prime\prime}U_2) = \tfrac12\sum x^2U_x = \tfrac12m_2,$$

$${}^{\mathrm{iv}}U_2 = \sum\frac{x(x^2-1)}{6}U_x = \tfrac16(m_3 - m_1),$$

$$\tfrac12({}^{\mathrm{v}}U_2 + {}^{\mathrm{v}}U_3) = \sum\frac{x^2(x^2-1)}{24}U_x = \tfrac{1}{24}(m_4 - m_2). \quad (1)$$


64. It is usually an economy of labor to divide the series into two at some central point and apply the summation separately to the two sides with the central point as origin. In combining the two care must be taken regarding the signs, the summations giving odd moments being subtracted and those giving even moments added. The following is an example of the working of this method. The numbers included in brackets represent the final quantities entering into the formulas of Art. 63.


  x    V_x   Δ²V_{x-1}    U_x      'U_x     ''U_x     '''U_x      ivU_x       vU_x

 -5      0       17        -1 }
 -4     17      121        12 }      11        11         11         11
 -3    155      156       148       159       170        181        192        203
 -2    449     -215       458       617       787        968      1,160       (783)
 -1    528      -85       532     1,149     1,936     (1,936)
  0    522      -42       524    (2,919)     (839)   (6,134.5)    (3,673)   (6,222.5)
  1    474      -71       477     1,246     2,775    (4,198.5)
  2    355      -35       356       769     1,529      2,811      4,833    (5,439.5)
  3    201       79       198       413       760      1,282      2,022      3,023
  4    126       -2       126       215       347        522        740
  5     49       71        46        89       132        175        218
  6     43      -37        45 }      43        43         43         43
  7      0       43        -2 }

The  second  difference  correction  only  is  used  in  this  case  and 
the  negative  values  appearing  for  Ux  are  combined  with  the  nearest 
positive  value.  The  combined  total  in  each  column  is  entered 
opposite  the  value  0 of  x,  this  and  other  adjusted  values  not  forming 
part  of  the  regular  summations  being  enclosed  in  brackets. 

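We have then

$$m_1 = \frac{839}{2919} = .287,$$

$$m_2 = \frac{2\times6134.5}{2919} = 4.203;\quad \mu_2 = 4.121;$$

$$(m_3 - m_1) = \frac{6\times3673}{2919} = 7.550;\quad m_3 = 7.837;\quad \mu_3 = 4.265;$$

$$(m_4 - m_2) = \frac{24\times6222.5}{2919} = 51.161;\quad m_4 = 55.364;\quad \mu_4 = 48.424.$$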
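The summation process of Arts. 63 and 64 can be followed step by step. The sketch below is an illustration with hypothetical function names; it takes the corrected ordinates of the example (with the small negative end values already combined with their neighbors, as in the working), forms the successive summations on each side of the central point, and combines them into the sums $\sum x^nU_x$.

```python
def tail_sums(values):
    """Cumulative sums taken from the far end of the series toward the origin."""
    out, running = [], 0
    for v in reversed(values):
        running += v
        out.append(running)
    return out[::-1]


def power_sums_one_side(u):
    """u[0] is the ordinate at distance 1 from the origin, u[1] at distance 2, ...

    Returns sum(x**n * U_x) for n = 0..4 over this side only.
    """
    u = list(u) + [0, 0]              # padding so that indices 1 and 2 exist
    s1 = tail_sums(u)                 # 'U
    s2 = tail_sums(s1)                # ''U
    s3 = tail_sums(s2)                # '''U
    s4 = tail_sums(s3)                # iv U
    s5 = tail_sums(s4)                # v U
    p0 = s1[0]
    p1 = s2[0]
    p2 = s3[0] + s3[1]                # '''U_1 + '''U_2
    p3 = 6 * s4[1] + p1               # 6 * ivU_2 = sum x^3 U - sum x U
    p4 = 12 * (s5[1] + s5[2]) + p2    # 12 * (vU_2 + vU_3) = sum x^4 U - sum x^2 U
    return [p0, p1, p2, p3, p4]


positive = [477, 356, 198, 126, 46, 43]   # U_x for x = 1 .. 6
negative = [532, 458, 148, 11]            # U_x for x = -1 .. -4
central = 524                             # U_0

pos = power_sums_one_side(positive)
neg = power_sums_one_side(negative)
# Even powers add, odd powers subtract; U_0 enters only the total number.
sums = [pos[0] + neg[0] + central,
        pos[1] - neg[1],
        pos[2] + neg[2],
        pos[3] - neg[3],
        pos[4] + neg[4]]

N = sums[0]
m1, m2, m3, m4 = (s / N for s in sums[1:])
mu2 = m2 - m1 ** 2
mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
print(sums, round(m1, 3), round(mu2, 3), round(mu3, 3), round(mu4, 3))
```

Carried to full precision the central moments differ slightly in the last figures from those of the text, which appear to have been worked with the rounded value .287 of $m_1$.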

65. In most frequency distributions it is observed that a single well-defined maximum appears and that as we depart from this value in either direction the frequency decreases and in most cases vanishes at the extreme limits. In graduating a frequency distribution of this kind by a mathematical formula, a general type found frequently useful is that given by the differential equation

$$\frac{1}{y}\frac{dy}{dx} = \frac{-(b+x)}{c+bx+ax^2},$$

where the mean value of $x$ is taken as origin. The equations connecting the moments with the constants of the equation may be obtained as follows: We have

$$(b+x)y\,dx = -(c+bx+ax^2)\,dy.$$

Multiplying by $x^{n-1}$ and integrating we have

$$\int_l^m(b+x)x^{n-1}y\,dx = -\int_l^m(c+bx+ax^2)x^{n-1}\,dy$$
$$= -\Big[(c+bx+ax^2)x^{n-1}y\Big]_l^m + \int_l^m\{(n-1)cx^{n-2} + nbx^{n-1} + (n+1)ax^n\}y\,dx,$$

so that, if $l$ and $m$ are taken so that $(c+bx+ax^2)x^{n-1}y$ vanishes at both limits, we have

$$b\mu_{n-1} + \mu_n = (n-1)c\mu_{n-2} + nb\mu_{n-1} + (n+1)a\mu_n$$

or

$$\{1-(n+1)a\}\mu_n = (n-1)b\mu_{n-1} + (n-1)c\mu_{n-2}.$$

Whence putting $n$ successively equal to 2, 3 and 4 and remembering that $\mu_0 = 1$ and $\mu_1 = 0$, we have

$$(1-3a)\mu_2 = c,$$
$$(1-4a)\mu_3 = 2b\mu_2,$$
$$(1-5a)\mu_4 = 3b\mu_3 + 3c\mu_2;$$

whence eliminating $b$ and $c$ we get

$$(1-5a)\mu_4 = \frac{3(1-4a)}{2}\cdot\frac{\mu_3^2}{\mu_2} + 3(1-3a)\mu_2^2$$

or, dividing through by $\tfrac12\mu_2^2$ and writing $\beta_1$ for $\mu_3^2/\mu_2^3$ and $\beta_2$ for $\mu_4/\mu_2^2$,

$$2(1-5a)\beta_2 = 3(1-4a)\beta_1 + 6(1-3a),$$

$$(2\beta_2 - 3\beta_1 - 6) = (10\beta_2 - 12\beta_1 - 18)a,$$

$$a = \frac{2\beta_2 - 3\beta_1 - 6}{10\beta_2 - 12\beta_1 - 18} = \frac{2(\beta_2+3) - 3(\beta_1+4)}{10(\beta_2+3) - 12(\beta_1+4)} = \frac{2-3\gamma}{10-12\gamma},$$

where

$$\gamma = \frac{\beta_1+4}{\beta_2+3},$$

$$b = \frac{1-4a}{2}\cdot\frac{\mu_3}{\mu_2} = \frac{1}{10-12\gamma}\cdot\frac{\mu_3}{\mu_2},$$

$$c = (1-3a)\mu_2 = \frac{4-3\gamma}{10-12\gamma}\,\mu_2.$$

The differential equation may therefore be written in the form

$$\frac{d\log y}{dx} = \frac{1}{y}\frac{dy}{dx} = \frac{-\left\{\dfrac{\mu_3}{\mu_2} + (10-12\gamma)x\right\}}{(4-3\gamma)\mu_2 + \dfrac{\mu_3}{\mu_2}x + (2-3\gamma)x^2} = \frac{-\left\{\dfrac{\mu_3}{\mu_2} + 2(1+2\delta)x\right\}}{(2+\delta)\mu_2 + \dfrac{\mu_3}{\mu_2}x + \delta x^2},$$

where

$$\delta = 2 - 3\gamma = \frac{2\beta_2 - 3\beta_1 - 6}{\beta_2+3}.$$

It may be shown that for any real frequency distribution $\gamma$ is positive and less than unity, so that $\delta$ lies between 2 and $-1$.
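The constants $a$, $b$ and $c$ follow directly from the three moments. As a minimal sketch (the function name and the test case are assumptions, not part of the text), the code below computes them and verifies the three moment equations on a hypothetical case, a curve of the form $x^{m-1}e^{-x/s}$ with $m = 5$ and $s = 2$, for which $\mu_2 = ms^2$, $\mu_3 = 2ms^3$ and $\mu_4 = 3m(m+2)s^4$.

```python
def pearson_constants(mu2, mu3, mu4):
    """Return beta_1, beta_2, gamma, delta and the constants a, b, c."""
    beta1 = mu3 ** 2 / mu2 ** 3
    beta2 = mu4 / mu2 ** 2
    gamma = (beta1 + 4) / (beta2 + 3)
    delta = 2 - 3 * gamma
    denom = 10 - 12 * gamma          # vanishes only if gamma = 5/6
    a = (2 - 3 * gamma) / denom
    b = (mu3 / mu2) / denom
    c = (4 - 3 * gamma) / denom * mu2
    return {"beta1": beta1, "beta2": beta2, "gamma": gamma,
            "delta": delta, "a": a, "b": b, "c": c}


# Hypothetical check: m = 5, s = 2.
m_shape, s = 5, 2.0
mu2 = m_shape * s ** 2
mu3 = 2 * m_shape * s ** 3
mu4 = 3 * m_shape * (m_shape + 2) * s ** 4

k = pearson_constants(mu2, mu3, mu4)
# Residuals of the three equations for n = 2, 3, 4; each should vanish.
res2 = (1 - 3 * k["a"]) * mu2 - k["c"]
res3 = (1 - 4 * k["a"]) * mu3 - 2 * k["b"] * mu2
res4 = (1 - 5 * k["a"]) * mu4 - 3 * k["b"] * mu3 - 3 * k["c"] * mu2
print(k, res2, res3, res4)
```

In this case $\delta$ and $a$ vanish, $b$ equals $s$ and $c$ equals $\mu_2$, so the quadratic in the denominator reduces to a linear expression.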

66. The form of the differential equation shows that in general $\log y$ becomes infinite either positively or negatively along with $x$ and also for those values of $x$ for which

$$(2+\delta)\mu_2 + \frac{\mu_3}{\mu_2}x + \delta x^2 = 0.$$

There are therefore three general types of curve represented by this equation, according as the roots of the equation are real and of different signs, real and of the same sign or complex. In the first case the curve is limited in both directions, the limiting values being the two roots of the equation for which values $y$ either vanishes or becomes infinite. In the second case the curve is limited in one direction by the numerically least of the two roots and is unlimited in the other direction, while in the third case it is unlimited in both directions. Between the first and second cases we have the limiting case where one of the roots is infinite, and between the second and third we have the case of equal roots. In both these cases the curve is limited in one direction. There are also the special cases where $\mu_3 = 0$ and the curve is symmetrical. Here we have the two general cases according as the roots are real or complex with the limiting case where they are infinite.

67. The criterion of the nature of the roots of the equation is

$$\frac{\mu_3^2/\mu_2^2}{4\delta(\delta+2)\mu_2} = \frac{\beta_1}{4\delta(\delta+2)}.$$

If this is negative, which is true when, and only when, $\delta$ is negative, the roots are real and of different sign. If it is positive and greater than unity they are real and of the same sign and if it is positive and less than unity, they are complex. We have therefore the eight types of curve determined by the relative values of $\beta_1$ and $\delta$ as follows:

Type     Value of β_1     Value of 4δ(δ+2)     Range of Curve

I            = 0              < 0              Limited both directions
II           = 0              = 0              Unlimited
III          = 0              > 0              Unlimited
IV           > 0              < 0              Limited both directions
V            > 0              = 0              Limited one direction
VI           > 0              > 0, < β_1       Limited one direction
VII          > 0              = β_1            Limited one direction
VIII         > 0              > β_1            Unlimited

68. The function to be integrated in the case of each type is a standard form taken up in treatises on the integral calculus. We will therefore merely give the form of the equation of the curve of frequency in each case in its simplest form and the equations connecting the constants involved with the moments. The functions of the moments which enter into various types are as follows:

$$\beta_1 = \frac{\mu_3^2}{\mu_2^3};\quad \beta_2 = \frac{\mu_4}{\mu_2^2};\quad \gamma = \frac{\beta_1+4}{\beta_2+3};$$

$$\delta = 2 - 3\gamma = \frac{2\beta_2 - 3\beta_1 - 6}{\beta_2+3};\quad \kappa^2 = \frac{\beta_1}{4\delta(\delta+2)}.$$


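Type I:

$$\beta_1 = 0;\quad \delta < 0.$$

Equation

$$y = k(a^2 - x^2)^{(n/2)-1},$$

$$n = \frac{2(1+\delta)}{-\delta},$$

$$a^2 = -\frac{2+\delta}{\delta}\,\mu_2 = (n+1)\mu_2,\quad a > 0,$$

$$k = \frac{N}{a^{n-1}\sqrt{\pi}}\cdot\frac{\Gamma[(n+1)/2]}{\Gamma(n/2)},$$

where $N$ is the total number.

Type II:

$$\beta_1 = 0;\quad \delta = 0.$$

Equation

$$y = ke^{-x^2/c^2},$$

$$c^2 = 2\mu_2,\quad c > 0.$$

Type III:

$$\beta_1 = 0;\quad \delta > 0.$$

Equation

$$y = k(a^2 + x^2)^{-[(n/2)+1]},$$

$$n = \frac{2(1+\delta)}{\delta},$$

$$a^2 = \frac{2+\delta}{\delta}\,\mu_2 = (n-1)\mu_2,\quad a > 0,$$

$$k = \frac{Na^{n+1}}{\sqrt{\pi}}\cdot\frac{\Gamma[(n/2)+1]}{\Gamma[(n+1)/2]}.$$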


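Type IV:

$$\beta_1 > 0;\quad \delta < 0.$$

Equation

$$y = k(a-x)^{np-1}(a+x)^{nq-1},$$

$$n = \frac{2(1+\delta)}{-\delta},$$

$$a^2 = (1-\kappa^2)(n+1)\mu_2,\quad a > 0,$$

$$(q-p)a = \frac{\mu_3}{2\delta\mu_2};\quad p + q = 1,$$

$$k = \frac{N}{(2a)^{n-1}}\cdot\frac{\Gamma(n)}{\Gamma(np)\Gamma(nq)},$$

$$m_1 = \frac{\mu_3}{2\delta\mu_2} = (q-p)a.$$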
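Type V:

$$\beta_1 > 0;\quad \delta = 0.$$

Equation

$$y = kx^{m-1}e^{-x/a},$$

$$m = \frac{4}{\beta_1};\quad a = \frac{\mu_3}{2\mu_2},$$

$$k = \frac{N}{|a|\cdot a^{m-1}\,\Gamma(m)},$$

where $|a|$ is the absolute value of $a$, neglecting sign,

$$m_1 = ma.$$

Type VI:

$$0 < 4\delta(\delta+2) < \beta_1.$$

Equation

$$y = k(x+a)^{-(nq+1)}(x-a)^{np-1},$$

$$n = \frac{2(1+\delta)}{\delta},$$

$$a^2 = (\kappa^2-1)(n-1)\mu_2,\quad a > 0,$$

$$(p+q)a = \frac{\mu_3}{2\delta\mu_2};\quad q - p = 1,$$

$$k = N(2a)^{n+1}\,\frac{\Gamma(nq+1)}{\Gamma(np)\Gamma(n+1)},$$

$$m_1 = \frac{\mu_3}{2\delta\mu_2} = (p+q)a.$$

Type VII:

$$\beta_1 > 0;\quad 4\delta(\delta+2) = \beta_1.$$

Equation

$$y = kx^{-(n+2)}e^{-a/x},$$

$$n = \frac{2(1+\delta)}{\delta}\quad\text{or}\quad\left(\frac{2n}{n-2}\right)^2 = \beta_1 + 4,$$

$$a = \frac{1+\delta}{\delta^2}\cdot\frac{\mu_3}{\mu_2} = \frac{n(n-2)}{4}\cdot\frac{\mu_3}{\mu_2},$$

$$k = N\,|a|\cdot a^n/\Gamma(n+1),$$

$$m_1 = \frac{\mu_3}{2\delta\mu_2} = \frac{n-2}{4}\cdot\frac{\mu_3}{\mu_2} = \frac{a}{n}.$$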


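Type VIII:

$$0 < \beta_1 < 4\delta(\delta+2).$$

Equation

$$y = k(a^2 + x^2)^{-(n/2+1)}e^{-\nu\tan^{-1}(x/a)},$$

$$n = \frac{2(1+\delta)}{\delta},$$

$$a^2 = (1-\kappa^2)(n-1)\mu_2,\quad a > 0,$$

$$a\nu = -\frac{1+\delta}{\delta^2}\cdot\frac{\mu_3}{\mu_2} = -\frac{n(n-2)}{4}\cdot\frac{\mu_3}{\mu_2},$$

$$k = Na^{n+1}\div\int_{-\pi/2}^{+\pi/2}\cos^n\theta\,e^{-\nu\theta}\,d\theta,$$

$$m_1 = \frac{\mu_3}{2\delta\mu_2} = \frac{n-2}{4}\cdot\frac{\mu_3}{\mu_2} = -\frac{a\nu}{n}.$$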

69. To illustrate the application of this formula take the frequency distribution for which the moments have already been worked out. We have

$$\mu_2 = 4.121;\quad \mu_3 = 4.265;\quad \mu_4 = 48.424,$$

whence

$$\beta_1 = .26;\quad \beta_2 = 2.85;\quad \gamma = .73;\quad \delta = -.19.$$

This shows that the curve is of type IV.

$$\kappa^2 = -\frac{.26}{4\times.19\times1.81} = -.19;$$

$$n = \frac{2\times.81}{.19} = \frac{1.62}{.19} = 8.5;$$

$$a^2 = 1.19\times9.5\times4.121 = 46.59;\quad a = 6.9,$$

$$(p-q)a = \frac{4.265}{2\times.19\times4.121} = 2.72;$$

$$p - q = .4;\quad p = .7;\quad q = .3;\quad np = 5.95;\quad nq = 2.55;$$

$$\log k = \log 2919 - 7.5\log 13.8 + \log\Gamma(8.5) - \log\Gamma(5.95) - \log\Gamma(2.55)$$
$$= 3.465234 - 7.5\times1.139879 + 4.147194 - 2.042232 - .139169$$
$$= \bar{4}.881934.$$

The origin must be taken so that the value of $m_1$ is $-2.76$, so that it must be at a point $.29 + 2.76$ or $3.05$ above the original origin, and the range of the curve is therefore from $-3.85$ to $+9.95$ on the original scale. If we retransfer to the original origin the equation of the curve takes the form

$$y = k(9.95 - x)^{4.95}(3.85 + x)^{1.55}.$$
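The logarithm of $k$ and the total area of the fitted curve can be checked by machine. The sketch below takes the rounded constants of this article ($N = 2919$, $n = 8.5$, $2a = 13.8$, $np = 5.95$, $nq = 2.55$) as given and uses the standard library function `math.lgamma` for the gamma functions; the midpoint-rule integration and the variable names are assumptions made for the illustration.

```python
import math

N = 2919
n, two_a = 8.5, 13.8
np_, nq_ = 5.95, 2.55

# log10 k = log N - (n - 1) log(2a) + log Gamma(n) - log Gamma(np) - log Gamma(nq)
log10_k = (math.log10(N)
           - (n - 1) * math.log10(two_a)
           + (math.lgamma(n) - math.lgamma(np_) - math.lgamma(nq_)) / math.log(10))
k = 10 ** log10_k


def ordinate(x):
    """Fitted curve on the original scale, valid for -3.85 < x < 9.95."""
    return k * (9.95 - x) ** (np_ - 1) * (3.85 + x) ** (nq_ - 1)


# Midpoint rule over the whole range; the area should reproduce N.
lower, upper, steps = -3.85, 9.95, 20000
h = (upper - lower) / steps
area = h * sum(ordinate(lower + (i + 0.5) * h) for i in range(steps))

print(round(log10_k, 6), round(area, 1))
```

The computed logarithm is negative, about $-3.118$, which is the value written above with a negative characteristic and positive mantissa.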


70. If we ignore the value of $\delta$ and consequently select type V we have

$$m = \frac{4}{\beta_1} = 15.5,\quad a = \frac{4.265}{8.242} = .52,$$

log  k = log  2919  - 15.5  log  .52  - log  r(15.5) 

= 3.465234  + 15.5  X 1.716003  - 11.524835  = 12.674434, 

mi  = 8.06, 


and the equation of the curve referred to the original origin takes the form

$$y = k(x + 7.77)^{14.5}e^{-(x+7.77)/.52}.$$

71. Another system of curves sometimes used for graduating frequency distributions is that for which the general equation is

$$y = A_0\varphi(x) + A_3\varphi'''(x) + A_4\varphi^{\mathrm{iv}}(x),$$

where

$$\varphi(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-b)^2/2\sigma^2}$$

and $A_0$, $A_3$, $A_4$, $b$ and $\sigma$ are arbitrary constants. If these constants are determined by the method of moments we have the following equations:

$$A_0 = N,\ \text{the total number};\quad b = m_1;\quad \sigma^2 = \mu_2;$$

$$A_3 = -\frac{N\mu_3}{3!};\quad A_4 = \frac{N\mu_4 - 3N\mu_2^2}{4!};$$

also

$$\varphi'''(x) = \varphi(x)\left\{\frac{3(x-b)}{\sigma^4} - \frac{(x-b)^3}{\sigma^6}\right\}$$

and

$$\varphi^{\mathrm{iv}}(x) = \varphi(x)\left\{\frac{3}{\sigma^4} - \frac{6(x-b)^2}{\sigma^6} + \frac{(x-b)^4}{\sigma^8}\right\}.$$

So that the equation may be written in terms of the moments as follows:

$$y = N\varphi(x)\left[1 - \frac{\sqrt{\beta_1}}{2}\left\{\frac{x-b}{\sigma} - \frac{(x-b)^3}{3\sigma^3}\right\} + \frac{\beta_2-3}{8}\left\{1 - \frac{2(x-b)^2}{\sigma^2} + \frac{(x-b)^4}{3\sigma^4}\right\}\right],$$

where $\sqrt{\beta_1}$ takes the sign of $\mu_3$.

In applying this equation, however, to a curve of very marked skewness the formula fails through the appearance of negative ordinates of significant value which are inadmissible as representing a frequency distribution. It is only applicable therefore where the curve approximates to the normal and the constants $A_3$ and $A_4$ represent relatively small corrections. (3)
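The failure of this system for very skew curves is easily exhibited. The sketch below evaluates the ordinate in terms of the moments, writing $z = (x-b)/\sigma$; the function name and the numerical values are hypothetical and chosen only to show a negative ordinate.

```python
import math


def ordinate(x, N, b, sigma, mu3, mu4):
    """Ordinate of the curve A0*phi + A3*phi''' + A4*phi^iv fitted by moments."""
    z = (x - b) / sigma
    phi = math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))
    skew = mu3 / sigma ** 3            # square root of beta_1, with the sign of mu_3
    excess = mu4 / sigma ** 4 - 3      # beta_2 - 3
    return N * phi * (1
                      + skew / 6 * (z ** 3 - 3 * z)
                      + excess / 24 * (z ** 4 - 6 * z ** 2 + 3))


# With mu_3 = 0 and mu_4 = 3 sigma^4 the normal curve is recovered.
normal_mode = ordinate(0.0, N=1000, b=0.0, sigma=1.0, mu3=0.0, mu4=3.0)

# Hypothetical case of marked skewness: the ordinate three sigma below the mean.
skew_tail = ordinate(-3.0, N=1000, b=0.0, sigma=1.0, mu3=2.0, mu4=3.0)

print(normal_mode, skew_tail)
```

The second value is negative, which is the inadmissible result referred to in the text.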


(b)  Mortality  Tables. 

72. The general law which is most frequently used in the graduation of mortality tables is the one devised by Makeham as a modification of the Gompertz formula. According to this law, the force of mortality at any given age $x$ may be expressed by the function $A + Bc^x$. Another way of expressing this law is by the equation $l_x = ks^xg^{c^x}$. In the expression for the force of mortality there are three arbitrary constants $A$, $B$ and $c$, while in the expression for the value of $l_x$ there are four, but the fourth constant merely determines the radix of the mortality table and does not affect in any way the rate of mortality, nor the monetary values depending thereon. The arbitrary constants may be determined directly from the original data of the exposed to risk and deaths; or a mortality table may have been constructed from the unadjusted death rates, and the original data may not be available or it may be desired to graduate the rough table without direct reference to the original data. (34)

73.  Where  a graduated  table  is  to  be  constructed  directly  from 
the  original  data,  it  is  usually  most  convenient  to  so  determine  the 
exposed  to  risk  as  to  give  central  death  rates  or  force  of  mortality, 
rather  than  rates  of  mortality.  When  this  is  done  the  most 
difficult  part  of  the  problem  is  the  determination  of  the  value  of  c. 
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If  the  value  of  c is  known  and  if  we  designate  by  Lx  the  number 
exposed  to  risk  in  the  middle  of  the  year  of  age  x and  by  6X,  the 
number  of  deaths  occurring  in  the  same  year  of  age,  we  may  deter- 
mine A and  B from  the  simultaneous  simple  equations 


A2LX  + B2Lxcx+*  = 20x, 
A2xLx  + B2Lxxcx+ * = 2x6  x. 


These  equations  express  the  conditions  that  the  total  number  of 
actual  deaths  is  equal  to  the  expected  and  that  the  mean  age  is 
the  same  in  the  two  groups.  The  problem  under  this  method 
of  graduation,  therefore,  substantially  resolves  itself  into  the 
determination  of  the  value  of  c. 
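
As an illustration of Art. 73, the two simultaneous equations may be solved directly once a value of c is assumed. This is a sketch only; the function name and argument layout are assumptions, and the sums are those of the two equations above.

```python
def fit_A_B(ages, exposed, deaths, c):
    """Solve the two moment equations of Art. 73 for A and B, c being given.

    exposed[i] is L_x (central exposed to risk) and deaths[i] is theta_x
    for age ages[i]; the force of mortality is taken at age x + 1/2.
    """
    w = [c ** (x + 0.5) for x in ages]
    s_L = sum(exposed)
    s_Lw = sum(L * wi for L, wi in zip(exposed, w))
    s_xL = sum(x * L for x, L in zip(ages, exposed))
    s_xLw = sum(x * L * wi for x, L, wi in zip(ages, exposed, w))
    s_d = sum(deaths)
    s_xd = sum(x * d for x, d in zip(ages, deaths))
    det = s_L * s_xLw - s_Lw * s_xL          # determinant of the 2x2 system
    A = (s_d * s_xLw - s_Lw * s_xd) / det    # Cramer's rule
    B = (s_L * s_xd - s_xL * s_d) / det
    return A, B
```

With the resulting A and B the expected deaths reproduce the total actual deaths and their first moment, whatever value of c was assumed.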

74.  One  method  which  has  been  suggested  for  determining  this 
value  is  by  fitting  a frequency  curve  of  type  V to  the  exposed  to 
risk,  and  recomputing  the  deaths  at  each  age  by  multiplying  the 
adjusted  value  of  the  exposed  by  the  unadjusted  force  of  mortality. 
In this case the equation of the curve is $y = kx^{m-1}e^{-x/a}$, where x is
not necessarily measured from age 0, but the origin is determined
according to the methods already described (Art. 68). In this
case let $E_0$, $E_1$, etc., represent the successive moments for the
exposures around the origin, $E_0'$, $E_1'$, etc., the similar moments
of the exposures multiplied by $c^x$, and $\theta_0$, $\theta_1$, etc., the similar
moments of the recomputed deaths. If then we put $\lambda$ for log c,
the force of mortality may be written in the form $A + Be^{\lambda x}$
and the equation for the adjusted curve of deaths will be
$y = Akx^{m-1}e^{-x/a} + Bkx^{m-1}e^{-x[(1/a)-\lambda]}$, where the second term is of
the same form as the first except that $(1/a) - \lambda$ is substituted for
$1/a$. Then we have by the well-known properties of the gamma
integral

$$\theta_0 = AE_0 + BE_0'$$

or

$$\frac{\theta_0}{E_0} = A + B\frac{E_0'}{E_0},$$

$$\theta_1 = AE_1 + BE_1' = AmaE_0 + B\frac{ma}{1-a\lambda}E_0'$$

or

$$\frac{\theta_1}{E_1} = A + B\frac{1}{1-a\lambda}\,\frac{E_0'}{E_0},$$

$$\theta_2 = AE_2 + BE_2' = Am(m+1)a^2E_0 + B\frac{m(m+1)a^2}{(1-a\lambda)^2}E_0'$$

or

$$\frac{\theta_2}{E_2} = A + B\frac{1}{(1-a\lambda)^2}\,\frac{E_0'}{E_0};$$

$$\frac{\theta_1}{E_1} - \frac{\theta_0}{E_0} = B\frac{a\lambda}{1-a\lambda}\,\frac{E_0'}{E_0},$$

$$\frac{\theta_2}{E_2} - \frac{\theta_1}{E_1} = B\frac{a\lambda}{(1-a\lambda)^2}\,\frac{E_0'}{E_0},$$

$$\frac{\theta_1/E_1 - \theta_0/E_0}{\theta_2/E_2 - \theta_1/E_1} = 1 - a\lambda,$$

from which $\lambda$ may be determined.
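
The final ratio of Art. 74 lends itself to a short numerical sketch. The function below is illustrative only (its name and interface are assumptions); it takes a from the relation $E_2/E_1 - E_1/E_0 = a$, which follows from $E_1 = maE_0$ and $E_2 = (m+1)aE_1$, and returns $\lambda$ as a natural logarithm of c.

```python
import math

def lambda_from_moments(E, theta):
    """Estimate lambda = ln c from moments about the origin of the curve.

    E = (E0, E1, E2): moments of the fitted exposures.
    theta = (theta0, theta1, theta2): moments of the recomputed deaths.
    """
    a = E[2] / E[1] - E[1] / E[0]
    r0, r1, r2 = (theta[i] / E[i] for i in range(3))
    one_minus_a_lambda = (r1 - r0) / (r2 - r1)
    return (1.0 - one_minus_a_lambda) / a

# Check on an exact gamma-type exposure curve (hypothetical constants).
m_, a_, lam, A, B = 5.0, 5.0, 0.09, 0.002, 0.0001
dx = 0.01
E = [0.0, 0.0, 0.0]
T = [0.0, 0.0, 0.0]
for i in range(1, 60001):
    x = i * dx
    y = x ** (m_ - 1) * math.exp(-x / a_)          # exposure ordinate
    d = y * (A + B * math.exp(lam * x))            # deaths ordinate
    for r in range(3):
        E[r] += x ** r * y * dx
        T[r] += x ** r * d * dx
print(lambda_from_moments(E, T))                   # close to 0.09
```

The check requires $a\lambda < 1$, since otherwise the second term of the curve of deaths does not converge.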

75.  This  method  is  open  to  the  objection  that  ordinarily  the 
exposed  to  risk  at  the  extreme  ages  will  be  overstated  by  the  fre- 
quency curve,  and  significant  values  may  be  obtained  for  ages 
which  are  not  represented  among  the  actual  observations,  and  it 
will  therefore  be  necessary  to  use  hypothetical  death  rates  at  these 
ages.  A curve  of  type  IV  might  be  used  instead,  but  in  that  case 
it  is  necessary  to  take  into  account  the  third  moments  of  the 
recomputed  deaths  and  the  weight  of  the  determination  of  the 
value  of  c is  thus  very  much  reduced.  (1) 

76. Another method is to combine the exposed to risk into a few
large groups and to fit a binomial series to the series so obtained.
Let us suppose that in the substituted series the exposed to risk
at age x is the coefficient of $y^{x-l}$ in the expansion of $N(p + qy^a)^n$
where p + q = 1. Then we have

$$\mu_2 = na^2pq,$$
$$\mu_3 = na^3pq(p - q),$$
$$\mu_4 = a^4\{npq(1 - 6pq) + 3n^2p^2q^2\};$$

whence

$$\beta_1 = \frac{(p - q)^2}{npq},$$
$$\beta_2 = 3 + \frac{1}{n}\left(\frac{1}{pq} - 6\right),$$
$$n = \frac{2}{3 + \beta_1 - \beta_2},$$
$$a^2 = \mu_2(\beta_1 + 4/n),$$
$$a(p - q) = \mu_3/\mu_2,$$
$$\mu_1 - l = nqa.$$

The quantities a and n, however, are for convenience ordinarily
taken as integral so that these equations can only be approximately
satisfied. When the substituted series has, however, been deter-
mined it is necessary to multiply each term by the proper force
of mortality, and the most convenient way of determining the
proper force of mortality to use is by dividing the exposed to risk
and the deaths into groups such that the central ordinate of each
group will correspond to one of the ages for which the force of
mortality is required. The central ordinate of each group can
then be determined by the formula already given in Art. 62
and the force of mortality determined by making the proper
division. By this method all of the data is brought into the
calculation. If then we equate the actual number and the first
and second moments taken about the point x = l of the recom-
puted deaths to the corresponding functions of the expected we
have the following equations for determining the value of c:

$$\theta_0 = AE_0 + B(p + qc^a)^nE_0,$$
$$\theta_1 = AnaqE_0 + Bnaqc^a(p + qc^a)^{n-1}E_0,$$
$$\theta_2 - a\theta_1 = An(n-1)a^2q^2E_0 + Bn(n-1)a^2q^2c^{2a}(p + qc^a)^{n-2}E_0;$$

$$\frac{\theta_0}{E_0} = A + B(p + qc^a)^n,$$
$$\frac{\theta_1}{E_1} = A + Bc^a(p + qc^a)^{n-1},$$
$$\frac{\theta_2 - a\theta_1}{E_2 - aE_1} = A + Bc^{2a}(p + qc^a)^{n-2};$$

$$\frac{\theta_1}{E_1} - \frac{\theta_0}{E_0} = Bp(c^a - 1)(p + qc^a)^{n-1},$$
$$\frac{\theta_2 - a\theta_1}{E_2 - aE_1} - \frac{\theta_1}{E_1} = Bpc^a(c^a - 1)(p + qc^a)^{n-2},$$

$$\frac{\theta_1/E_1 - \theta_0/E_0}{(\theta_2 - a\theta_1)/(E_2 - aE_1) - \theta_1/E_1} = \frac{p + qc^a}{c^a} = pc^{-a} + q.$$
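
Since the last ratio equals $pc^{-a} + q$, the value of c follows at once when p and a are known. The sketch below is illustrative; the function name is an assumption, and the moments are taken about the point x = l as in the equations above.

```python
from math import comb

def c_from_binomial(p, a, theta, E):
    """Solve ratio = p*c**(-a) + q for c.

    theta = (theta0, theta1, theta2) and E = (E0, E1, E2) are moments about
    x = l of the recomputed deaths and of the substituted exposures.
    """
    q = 1.0 - p
    ratio = (theta[1] / E[1] - theta[0] / E[0]) / (
        (theta[2] - a * theta[1]) / (E[2] - a * E[1]) - theta[1] / E[1])
    return (p / (ratio - q)) ** (1.0 / a)

# Synthetic check with hypothetical constants: exposures exactly binomial.
n, p, a, c_true = 10, 0.6, 5, 10 ** 0.04
A, B = 0.002, 0.0005                      # B here absorbs the factor c**l
expo = [1000.0 * comb(n, j) * p ** (n - j) * (1 - p) ** j for j in range(n + 1)]
dth = [e * (A + B * c_true ** (j * a)) for j, e in enumerate(expo)]
E = [sum((j * a) ** r * e for j, e in enumerate(expo)) for r in range(3)]
T = [sum((j * a) ** r * d for j, d in enumerate(dth)) for r in range(3)]
print(c_from_binomial(p, a, T, E), c_true)
```

The exponent n does not enter the final ratio, so it is not an argument of the function.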


77.  Another  method  of  determining  the  value  of  c direct  from 
the  original  data  suggested  by  Mr.  Hardy  is  to  arrange  the  exposed 
to  risk  and  the  deaths  in  quinquennial  groups,  calculating  the 
central  ordinates  for  each  group  and  the  corresponding  force  of 
mortality by the method just outlined. This gives us a series of
values of $\mu_x$ proceeding by intervals of 5 years. The values at
the extreme ages are rejected and beginning at say 32½ the six
successive values covering ages 32½ to 57½ are combined after
weighting them respectively by the factors 1, 3, 5, 5, 3, 1. The
six values for ages 47½ to 72½ are similarly combined and also
those for ages 62½ to 87½. We have then the following equations:

$$S_1 = 18A + Bc^{32\frac12}(1 + 3c^5 + 5c^{10} + 5c^{15} + 3c^{20} + c^{25}) = 18A + Bc^{32\frac12}f(c),$$
$$S_2 = 18A + Bc^{47\frac12}f(c),$$
$$S_3 = 18A + Bc^{62\frac12}f(c),$$
$$S_2 - S_1 = Bc^{32\frac12}(c^{15} - 1)f(c),$$
$$S_3 - S_2 = Bc^{47\frac12}(c^{15} - 1)f(c),$$

whence $c^{15} = (S_3 - S_2)/(S_2 - S_1)$.
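
Mr. Hardy's process may be sketched as follows. The code is an illustration under stated assumptions: the dictionary layout and function name are hypothetical, and the closing step $c^{15} = (S_3 - S_2)/(S_2 - S_1)$ is the ratio of the last two equations.

```python
def hardy_constants(mu, start=32.5, step=5):
    """Return (c, A, B) from forces of mortality at ages 32.5, 37.5, ..., 87.5.

    mu maps each central age to the force of mortality found for that group.
    """
    w = (1, 3, 5, 5, 3, 1)                      # weights, summing to 18

    def S(first):
        return sum(wi * mu[first + i * step] for i, wi in enumerate(w))

    S1, S2, S3 = S(start), S(start + 15), S(start + 30)
    c = ((S3 - S2) / (S2 - S1)) ** (1.0 / 15)
    f = sum(wi * c ** (step * i) for i, wi in enumerate(w))
    B = (S2 - S1) / (c ** start * (c ** 15 - 1) * f)
    A = (S1 - B * c ** start * f) / 18
    return c, A, B
```

Applied to values lying exactly on a Makeham curve, the function returns the constants of that curve; with observed values it gives the weighted fit described above.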


78.  It  has  been  found  by  experience  that  the  value  of  log  c in 
practically  every  case  falls  between  .035  and  .045  and  satisfactory 
results  can  sometimes  be  obtained  with  much  less  labor  than  by 
either  of  the  above  methods  by  using  trial  values  of  log  c beginning 
with  .04.  When  this  method  is  followed  the  graduated  table 
resulting  from  each  value  of  c is  tested  in  the  usual  method  and 
the  table  which  gives  the  most  satisfactory  results  is  selected. 

79.  It  will  be  found,  however,  that,  unless  the  value  of  log  c 
which  is  adopted  agrees  closely  with  the  data,  if  the  values  of  A 
and  B are  derived  according  to  the  method  described  in  Art.  73 
and  the  corresponding  rates  of  mortality  computed  and  applied 
to  the  exposed  to  risk  the  total  number  of  deaths  and  the  first 
moment  will  not  be  reproduced.  If,  therefore,  it  is  desired  to 
obtain  a mortality  table  based  on  a selected  value  of  c,  not  neces- 
sarily agreeing  with  the  data,  which  will  reproduce  the  total 
number  of  deaths  and  the  first  moment  a different  process  is 
required.  We  have 

$$\operatorname{colog}_{10} p_x = \alpha + \beta c^x = \alpha + c^{x-m};$$

$$\therefore\quad p_x = 10^{-\alpha}p'_{x-a} = rp'_{x-a},$$

where $p'_x$ is so determined that $\operatorname{colog}_{10} p'_x = c^{x-m+a}$. We there-
fore select an arbitrary value of m - a and the problem reduces
to the determination of the values of r and a. The method of
Art. 73 will give an approximate value of a and it is then easy
by a method of trial and error to determine the value of a for which

$$\frac{\sum xE_xp'_{x-a}}{\sum E_xp'_{x-a}} = \frac{\sum x(E_x - \theta_x)}{\sum(E_x - \theta_x)}.$$

We then determine r from the equation $r\sum E_xp'_{x-a} = \sum(E_x - \theta_x)$.
From these two equations it follows that where $p_x = rp'_{x-a}$
we have

$$\sum(E_xq_x - \theta_x) = 0,$$
$$\sum x(E_xq_x - \theta_x) = 0.$$

In  other  words  the  sum  of  the  deviations  and  of  the  accumulated 
deviations  will  vanish. 
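
The trial and error of Art. 79 can be mechanized by bisection. What follows is a sketch under assumptions that are not in the original: the function name, the bracketing interval (lo, hi) supplied by the user, and the use of bisection in place of hand interpolation. It relies on the mean age of the expected survivors increasing with a.

```python
def fit_a_and_r(ages, E, theta, log10_c, m_minus_a, lo, hi, iters=100):
    """Find a and r so that total deaths and their first moment are reproduced.

    E[i] is the exposed to risk and theta[i] the deaths at age ages[i];
    m_minus_a is the arbitrary value chosen for m - a.
    """
    def p_prime(x):
        # colog10 p'_x = c**(x - (m - a))
        return 10.0 ** (-(10.0 ** (log10_c * (x - m_minus_a))))

    surv = [e - t for e, t in zip(E, theta)]            # E_x - theta_x
    target = sum(x * s for x, s in zip(ages, surv)) / sum(surv)

    def mean_age(a):
        w = [e * p_prime(x - a) for x, e in zip(ages, E)]
        return sum(x * wi for x, wi in zip(ages, w)) / sum(w)

    for _ in range(iters):                               # bisection on a
        mid = 0.5 * (lo + hi)
        if mean_age(mid) < target:
            lo = mid
        else:
            hi = mid
    a = 0.5 * (lo + hi)
    r = sum(surv) / sum(e * p_prime(x - a) for x, e in zip(ages, E))
    return a, r
```

The rates then follow from $p_x = rp'_{x-a}$, and the sum of the deviations and of the accumulated deviations vanishes up to rounding.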

80. Applying the method of Art. 73 to the mortality experience
given on page 5 we obtain $\operatorname{colog}_{10} p_x = .00193 + .00003737c^x$ on
the assumption that $\log_{10} c = .04$. This gives us 110.7 as an
approximate value of m. Taking therefore 100 as the arbitrary
value of m - a and using 10 and 11 as trial values, we have

$$\frac{\sum(100 - x)(E_x - \theta_x)}{\sum(E_x - \theta_x)} = \frac{82{,}938}{3{,}220} = 25.7571,$$

$$\frac{\sum(100 - x)E_xp'_{x-10}}{\sum E_xp'_{x-10}} = \frac{82{,}868.06}{3{,}214.21} = 25.7818,$$

$$\frac{\sum(100 - x)E_xp'_{x-11}}{\sum E_xp'_{x-11}} = \frac{83{,}501.88}{3{,}246.59} = 25.7199.$$

We take therefore 10.4 as the value of a and we have

$$r = \frac{\sum(E_x - \theta_x)}{\sum E_xp'_{x-10.4}} = \frac{3{,}220}{3{,}227.16},$$

$$\operatorname{colog}_{10} r = .00096.$$

Hence we have finally

$$\operatorname{colog}_{10} p_x = .00096 + 10^{.04(x - 110.4)}.$$

The table on page 63 shows the resulting rates of mortality and
the comparison of the actual with the expected claims.

Graduation of Data by Makeham's Formula

 Age     q_x     Expected   Actual   Deviation   Accumulated
                  Deaths    Deaths               Deviation
  55    .0161        .0        0         .0          .0
  56    .0174        .1        0        +.1         +.1
  57    .0189        .2        0        +.2         +.3
  58    .0205        .4        0        +.4         +.7
  59    .0222        .7        1        -.3         +.4
  60    .0241       1.2        1        +.2         +.6
  61    .0262       1.5        3       -1.5         -.9
  62    .0285       2.1        2        +.1         -.8
  63    .0310       2.6        0       +2.6        +1.8
  64    .0337       3.4        4        -.6        +1.2
  65    .0367       3.9        1       +2.9        +4.1
  66    .0400       4.6        1       +3.6        +7.7
  67    .0435       5.6        3       +2.6       +10.3
  68    .0474       6.3        5       +1.3       +11.6
  69    .0517       7.0       11       -4.0        +7.6
  70    .0563       7.6        6       +1.6        +9.2
  71    .0614       8.8       12       -3.2        +6.0
  72    .0669       9.4       10        -.6        +5.4
  73    .0729      10.5       11        -.5        +4.9
  74    .0794      11.8        6       +5.8       +10.7
  75    .0866      13.3       16       -2.7        +8.0
  76    .0944      14.2       24       -9.8        -1.8
  77    .103       14.3        8       +6.3        +4.5
  78    .112       16.2       16        +.2        +4.7
  79    .122       17.1       13       +4.1        +8.8
  80    .133       18.2       19        -.8        +8.0
  81    .144       19.6       21       -1.4        +6.6
  82    .157       19.8       23       -3.2        +3.4
  83    .170       21.4       26       -4.6        -1.2
  84    .185       20.2       26       -5.8        -7.0
  85    .201       18.3       23       -4.7       -11.7
  86    .218       16.8       21       -4.2       -15.9
  87    .236       15.6       16        -.4       -16.3
  88    .255       13.8       12       +1.8       -14.5
  89    .276       13.5       15       -1.5       -16.0
  90    .298       11.6        9       +2.6       -13.4
  91    .321       10.0        7       +3.0       -10.4
  92    .346        9.3        6       +3.3        -7.1
  93    .372        8.2        7       +1.2        -5.9
  94    .400        6.0        2       +4.0        -1.9
  95    .429        5.1        3       +2.1         +.2
  96    .459        3.7        4        -.3         -.1
  97    .490        2.0        1       +1.0         +.9
  98    .521        1.6        2        -.4         +.5
  99    .554         .6        1        -.4         +.1

 Total            398.1      398      +51.0      +128.3
                                      -50.9      -124.9

81. The value of c determined by any of these methods may,
however, be considered as approximate only and in this case, if the
true value of c be designated by $c + \delta c$, the expression for $\mu_x$
becomes approximately $A + Bc^x + Bxc^{x-1}\delta c$, which may be
written in the form $A + Bc^x + Dxc^x$, where $D = B\,\delta c/c$. If we
then equate the total number and the first and second moments
of the actual deaths with the corresponding functions of the
expected, we have the following equations for determining A,
B and D:

$$AE_0 + BE_0' + DE_1' = \theta_0,$$
$$AE_1 + BE_1' + DE_2' = \theta_1,$$
$$AE_2 + BE_2' + DE_3' = \theta_2.$$


In  this  case  the  moments  are  all  taken  about  the  value  zero  of  x. 

82.  When  a mortality  table  is  to  be  graduated  without  direct 
reference  to  the  original  data  the  simplest  way  in  which  the 
constants  can  be  determined  is  by  using  four  equidistant  values 
of  log  lx  in  which  case  we  have  the  following  equations  for  deter- 
mining the  constants : 

$$\log l_x = \log k + x\log s + c^x\log g,$$
$$\log l_{x+t} = \log k + (x + t)\log s + c^{x+t}\log g,$$
$$\log l_{x+2t} = \log k + (x + 2t)\log s + c^{x+2t}\log g,$$
$$\log l_{x+3t} = \log k + (x + 3t)\log s + c^{x+3t}\log g.$$

Denoting then the difference over an interval t by the symbol $\Delta$,
we have

$$\Delta\log l_x = t\log s + c^x(c^t - 1)\log g,$$
$$\Delta^2\log l_x = c^x(c^t - 1)^2\log g,$$

whence

$$c^t = \frac{\Delta^2\log l_{x+t}}{\Delta^2\log l_x},$$
$$\log g = \frac{\Delta^2\log l_x}{c^x(c^t - 1)^2},$$
$$t\log s = \Delta\log l_x - c^x(c^t - 1)\log g = \Delta\log l_x - \frac{\Delta^2\log l_x}{c^t - 1},$$
$$\log k = \log l_x - x\log s - c^x\log g.$$
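
The four-point process of Art. 82 is easily set out as a short routine. This is a minimal sketch; the function name and the order of the returned values are assumptions, and the logarithms may be to any fixed base provided the same base is used throughout.

```python
def makeham_four_points(x, t, log_l):
    """Constants from log l at the four ages x, x+t, x+2t, x+3t.

    Returns (c, log_g, log_s, log_k) in the same logarithmic base as log_l.
    """
    d1 = [log_l[i + 1] - log_l[i] for i in range(3)]    # first differences
    d2 = [d1[i + 1] - d1[i] for i in range(2)]          # second differences
    ct = d2[1] / d2[0]                                   # c**t
    c = ct ** (1.0 / t)
    log_g = d2[0] / (c ** x * (ct - 1) ** 2)
    log_s = (d1[0] - d2[0] / (ct - 1)) / t
    log_k = log_l[0] - x * log_s - c ** x * log_g
    return c, log_g, log_s, log_k
```

A table built from the returned constants passes exactly through the four values of $\log l_x$ used, which is why the choice of ages and interval bears so heavily on the result.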


83. It will be seen that by this method the integral of $\mu_x$ over
each interval is made the same in the graduated table as in the
ungraduated and that the experience not included in the three
intervals is ignored. In view of the somewhat excessive im-
portance placed on particular points of division by this method
and of the varying results which may be obtained by the selection
of different ages and intervals, this method is usually modified
by using instead of a single value of $\log l_x$ the sum of the values
for a group of adjacent ages. In this case we have the following
equations for determining the constants:

$$\sum_{x}^{x+n-1}\log l_x = n\log k + n\left(x + \frac{n-1}{2}\right)\log s + c^x\,\frac{c^n - 1}{c - 1}\log g,$$

$$\Delta\sum\log l_x = nt\log s + \frac{c^x(c^t - 1)(c^n - 1)}{c - 1}\log g,$$

$$\Delta^2\sum\log l_x = \frac{c^x(c^t - 1)^2(c^n - 1)}{c - 1}\log g,$$

whence

$$c^t = \frac{\Delta^2\sum\log l_{x+t}}{\Delta^2\sum\log l_x},$$

$$\log g = \frac{(c - 1)\,\Delta^2\sum\log l_x}{c^x(c^t - 1)^2(c^n - 1)},$$

$$nt\log s = \Delta\sum\log l_x - \frac{c^x(c^t - 1)(c^n - 1)}{c - 1}\log g = \Delta\sum\log l_x - \frac{\Delta^2\sum\log l_x}{c^t - 1},$$

$$n\log k = \sum\log l_x - \frac{n(2x + n - 1)}{2}\log s - \frac{c^x(c^n - 1)}{c - 1}\log g.$$

Where the number n of ages included in each group is the same
as the interval t, this method becomes the one proposed by King
and Hardy and described in the Institute Text-Book. (35) (36)

84. This modification minimizes but does not entirely eliminate
the importance given to special points of division and Professor
Karl Pearson has suggested the application of the methods of
moments direct to the values of $\log l_x$, determining the constants
so as to reproduce the integral of $\log l_x$, within suitable limits,
and its first, second and third moments. As, however, we have
only the value of $\log l_x$ for integral values of x, this method in its
original form requires the use of a formula of approximate integra-
tion. The work, however, can be simplified without interfering
with the simplicity of the resulting equations by using summation
instead of integration. Under this method we have the following
equations for determining the constants:

$$\log l_{a+x} = \log k + (a + x)\log s + c^{a+x}\log g$$
$$= (\log k + a\log s) + x\log s + c^x\cdot c^a\log g$$
$$= \log k' + x\log s + c^x\log g',$$

$${}_1S_x = \sum_{0}^{x-1}\log l_{a+x} = x\log k' + \frac{x^{(2)}}{2}\log s + \frac{c^x - 1}{c - 1}\log g',$$

$${}_2S_x = \sum_{0}^{x-1}{}_1S_x = \frac{x^{(2)}}{2}\log k' + \frac{x^{(3)}}{3!}\log s + \left\{\frac{c^x - 1}{(c - 1)^2} - \frac{x}{c - 1}\right\}\log g',$$

$${}_3S_x = \sum_{0}^{x-1}{}_2S_x = \frac{x^{(3)}}{3!}\log k' + \frac{x^{(4)}}{4!}\log s + \left\{\frac{c^x - 1}{(c - 1)^3} - \frac{x}{(c - 1)^2} - \frac{x^{(2)}}{2(c - 1)}\right\}\log g', \quad (4)$$

$${}_4S_x = \sum_{0}^{x-1}{}_3S_x = \frac{x^{(4)}}{4!}\log k' + \frac{x^{(5)}}{5!}\log s + \left\{\frac{c^x - 1}{(c - 1)^4} - \frac{x}{(c - 1)^3} - \frac{x^{(2)}}{2(c - 1)^2} - \frac{x^{(3)}}{3!\,(c - 1)}\right\}\log g', \quad (5)$$

where, generally, $x^{(r)} = x(x - 1)\cdots(x - r + 1)$.

Putting then n for x in these last four equations, where n is the
total range to be included in the summations, and eliminating
$\log k'$, $\log s$ and $\log g'$, by multiplying the equations through by
$2/n$, $3!/n^{(2)}$, $4!/n^{(3)}$ and $5!/n^{(4)}$ respectively, differencing twice and
taking the ratio of the two second differences we obtain an equa-
tion in c which may be solved by successive approximations to
any required degree of accuracy. The other constants are then
determined from simple equations. (37)

85.  Another  method  of  graduating  mortality  tables  is  to  make 
use  of  a hypothetical  table  of  exposed  to  risk  conforming  with  a 
frequency  distribution  of  such  a type  as  to  give  manageable 
equations  for  determining  the  constants.  The  actual  working  of 
this  method  is  the  same  as  in  the  case  where  a graduated  table  is 
constructed  direct  from  the  original  data  by  fitting  a frequency 
curve  to  the  exposed  to  risk,  as  described  in  article  74. 

86.  In  the  preceding  discussion  it  has  been  assumed  that  the 
mortality  is  analyzed  only  according  to  attained  age,  and  where 
select or analyzed tables are required some modification of the
method is usually necessary. This is usually done by making A
and  B functions  of  the  duration,  the  value  of  c being  taken  as  the 
same  for  all  durations.  For  practical  reasons,  the  functions  A 
and  B are  usually  assumed  to  become  constant  after  some  definite 
duration,  which  in  some  cases  is  fixed  at  five  years  and  in  some 
cases  at  ten  years.  The  method  usually  followed  is  to  set  out  the 
experience  for  each  year  of  duration  so  far  as  it  is  intended  to 
follow  selection  and  to  determine  by  some  of  the  methods  already 
described  the  corresponding  values  of  A and  B,  the  data  for  each 
year  of  duration  being  treated  as  representing  a mortality  table 
complete  in  itself.  These  values  will,  however,  be  somewhat 
irregular,  so  that  they  themselves  require  further  graduation. 
In  graduating  tables  of  this  kind  it  is  usually  better  to  work  with 
$\operatorname{colog} p_x$ rather than $\mu_x$, the general form of the two functions being
identical.  In  selecting  formulas  for  graduating  these  rough  values, 
the  following  conditions  should  be  satisfied:  (1)  A smooth  junction 
between  the  curves  representing  the  select  and  ultimate  tables. 
(2)  An  agreement  between  the  graduated  and  ungraduated  values 
of  colog  px  in  year  0 as  special  importance  is  attached  to  the  rate 
of  mortality  in  the  first  year  of  insurance.  (3)  An  agreement 
between  the  aggregate  graduated  and  ungraduated  values  of  these 
functions  during  the  period  between  the  date  of  entry  and  the 
ultimate  table.  Considerable  experimenting  will  usually  be  neces- 
sary to  determine  a function  complying  with  these  conditions. 
The final form of the equation which was adopted for the $O^{[M]}$
experience was as follows:

$$\log_{10} l_{[x]+t} = \log_{10} l_{x+t} - f_t - \phi_tc^x,$$

where

$$f_t = m(10 - t)^2 + m'(c')^t$$

and

$$\phi_t = n(10 - t)^2.$$

(1) (40) (41)

87.  When  an  analyzed  mortality  table  is  being  constructed  it 
may  be  desired  to  merge  the  analyzed  tables  into  the  ultimate  at  a 
duration  somewhat  shorter  than  that  over  which  the  effect  of 
selection  is  known  to  extend.  In  this  case  a mortality  table  based 
on  the  average  experience  for  all  longer  durations  will  not  give 
correct  values  for  the  annuity  or  for  the  expectation  of  life  at  the 
point of junction, but will give values too high at the young ages
and too low at the older ages, and this error will affect the corre-
sponding values at date of entry. If, therefore, the values at date
of  entry  are  more  important  than  those  for  longer  durations,  it  is 
necessary  to  so  determine  the  constants  as  to  reproduce  those 
values  as  nearly  as  possible.  The  problem  presented  is  therefore 
to  graduate  by  Makeham’s  formula  a series  of  annuity  values 
or  of  values  of  the  expectation  of  life.  As  the  expectation  of  life 
is  merely  the  annuity  value  for  zero  interest  the  annuity  may  be 
taken  as  the  general  case  for  discussion. 

The  same  problem  also  arises  where  a mortality  table  has  been 
in  general  use  and  has  been  adopted  as  a standard,  and  it  is  desired 
to  regraduate  it  by  Makeham’s  law  for  certain  special  purposes. 
In  this  case  also  it  is  important  that  the  monetary  values  should 
be  reproduced  as  nearly  as  possible. 

88.  In  the  following  discussion  we  will  assume  that  approximate 
values  of  the  constants  are  known.  These  values  may  be  derived 
by  any  of  the  methods  already  described  which  will  apply  to  the 
particular  case.  The  next  step  is  then  to  assign  weights  to  the 
various  ungraduated  values  used  as  a basis.  In  the  case  of  an 
analyzed  table  these  values  will  ordinarily  proceed  by  quin- 
quennial or  other  intervals,  each  value  being  derived  from  the 
experience  of  a group  of  entry  ages  suitably  corrected  to  reduce 
it  to  the  central  age  of  the  group.  If  then  n is  the  total  number 
exposed  to  risk  and  nq  the  total  number  of  deaths  in  the  group 
upon which the annuity value is based the mean deviation, irre-
spective of sign, in the number of deaths is approximately
$.8\sqrt{nq(1 - q)}$, the ratio of which to the total number of deaths is
$.8\sqrt{(1 - q)/nq}$. If now from the approximate graduation we
construct a table showing the change in the various values of $a_x$
corresponding  to  a change  of  1 per  cent,  in  the  mortality,  the 
average  deviation  in  the  annuity  values  may  be  expressed  approxi- 
mately by  the  product  of  one  hundred  times  this  change  and  the 
above  ratio.  The  annuity  values  may  then  be  reduced  to  the 
same  weight  by  multiplying  them  by  numbers  in  proportion  to  the 
reciprocals  of  their  average  deviation.  If  a standard  table  is  being 
regraduated  weights  may  be  assigned  in  proportion  to  the  relative 
importance  of  the  different  sections  of  the  table.  (1) 

89. Suppose now that the force of mortality may be expressed in
the form $\mu_x = A + Bc^x = A' + h + B'c^{x+k}$, where $A'$ and $B'$
are approximate values of A and B, and let accented letters
designate values computed on the basis of these approximate
values. Then we have approximately

$$a_x = a'_x - h(Ia')_x + k\frac{da'_x}{dx}.$$

The values of h and k are then determined by the method of
moments. If we designate the ungraduated values of $a_x$ by a
double accent and denote the weights by $w_x$ the equations are

$$\sum w_x(a''_x - a'_x) = -h\sum w_x(Ia')_x + k\sum w_x\frac{da'_x}{dx},$$

$$\sum xw_x(a''_x - a'_x) = -h\sum xw_x(Ia')_x + k\sum xw_x\frac{da'_x}{dx}.$$


If  the  values  of  h and  k so  determined  are  considerable  they 
should  not  be  accepted  as  final  but  the  constants  so  arrived  at 
should  be  used  as  a basis  for  a further  approximation. 

90.  In  the  foregoing  it  has  been  assumed  that  the  value  of  c was 
known.  Where  it  is  to  be  determined  we  may  assume  trial  values 
of  c and  graduate  by  the  above  method  and  then  select  the  value 
giving  the  best  graduation  or  we  may  suppose  it  subject  to  varia- 
tion and  adopt  the  following  process. 

Let

$$\mu_x = A + Bc^x = A + Be^{\lambda x} = A' + h + B'e^{(\lambda' + l)(x + k)}.$$

Then we have

$$a_x = a'_x + h\frac{da'_x}{dA} + k\frac{da'_x}{dx} + l\frac{da'_x}{d\lambda}.$$


And  h,  k and  l may  be  determined  by  the  method  of  moments,  the 
second  moments  being  brought  in  to  furnish  a third  equation. 

91. To obtain an expression for $da'_x/d\lambda$ let us designate the
value of $a_x$ calculated on the basis of the constants A, B and $\lambda$ by
$f(A, B, \lambda, x)$. Then we have

$$f\{(A + rA + r\delta),\ (1 + r)B,\ (1 + r)\lambda,\ x\} = \frac{1}{1 + r}\,f\{A, B, \lambda, (1 + r)x\}$$

for all values of r, when $\delta$ is the force of interest. Let then r
become indefinitely small and we have

$$(A + \delta)\frac{da_x}{dA} + B\frac{da_x}{dB} + \lambda\frac{da_x}{d\lambda} = x\frac{da_x}{dx} - a_x;$$

also

$$f(A, B, \lambda, x + h) = f(A, Be^{\lambda h}, \lambda, x),$$

so that

$$\lambda B\frac{da_x}{dB} = \frac{da_x}{dx};$$

we have therefore

$$\lambda\frac{da_x}{d\lambda} = \left(x - \frac{1}{\lambda}\right)\frac{da_x}{dx} + (A + \delta)(Ia)_x - a_x.$$


It  is  to  be  noted  that  these  relations  are  obtained  for  continuous 
annuity  values  and  are  not  strictly  correct  for  annual  annuities. 

92. In some cases where a mortality table cannot be represented
by Makeham's law in its simplest form without change of con-
stants, we may put $l_x = l_x' \pm l_x''$ where $l_x'$ and $l_x''$ each follow
that law. In such cases it is of advantage to use if possible the
same value of c in each of the subsidiary tables. The constants
of $l_x'$ are determined from the experience at the older ages and the
constants of $l_x''$ are then determined so as to fit the values of
$\pm(l_x - l_x')$. It has also been suggested to modify the formula
by adding an additional term and putting $\mu_x = A + Hx + Bc^x$,
and Mr. Hardy found, in graduating the $O^M$ table, that where
the table had been graduated by Makeham's law, he could
use a relation of the form

Cologio  (Pr)°M  = cologio  (Pz)°MW  - i:  + Qe~b<^). 

x 

Or,  in  other  words,  the  differences  between  the  values  of 
A Cologio  px  according  to  the  two  tables  was  represented  by  a 
double  frequency  curve. 

93.  Another  formula  intended  to  cover  the  whole  range  of  the 
mortality  table  from  infancy  to  extinction  was  suggested  by 
Wittstein.  His  assumption  is 

$$q_x = a^{-(M - x)^n} + \frac{1}{m}a^{-(mx)^n}.$$

Here M is one year less than the limiting age. To determine
m we have

$$\frac{dq_x}{dx} = n(M - x)^{n-1}\log a\cdot a^{-(M - x)^n} - n(mx)^{n-1}\log a\cdot a^{-(mx)^n}.$$

This evidently vanishes when mx = M - x or x = M/(m + 1).
The age for which $q_x$ is to be a minimum is then decided upon and
m is determined by the relation m + 1 = M/x. Also we have
$q_0 = (1/m) + a^{-M^n}$. The value of m should be so determined as
to reconcile as well as possible these two considerations. The term
$(1/m)a^{-(mx)^n}$ becomes negligible for adult ages and in order to
determine the values of a and n we may put

$$q_x = a^{-(M - x)^n}$$

or

$$\log\log\frac{1}{q_x} = n\log(M - x) + \log\log a.$$

The respective terms are then properly weighted and the values
of n and a determined by the method of moments or by the
method of least squares. It is evident that if we denote $a^{-(M - x)^n}$
by $q_x'$ and $(1/m)a^{-(mx)^n}$ by $q_x''$ we have $q_x = q_x' + q_x''$. Then
for infantile ages where the second term preponderates we have

$$(q_x - q_x') = \frac{1}{m}a^{-(mx)^n}.$$


This  may  if  desired  be  used  to  calculate  by  the  method  of  moments 
or  of  least  squares  a corrected  value  of  m.  It  was  thought  by  Mr. 
Wittstein  that  the  quantities  a and  n might  prove  to  be  absolute 
constants  with  a value  of  a in  the  neighborhood  of  1.421  and  of  n 
in  the  neighborhood  of  .633.  (42) 
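
Wittstein's formula may be tabulated with a few lines of code. This is a sketch only: the values of a and n are those quoted above, while M = 95 and m = 8 are hypothetical choices made purely for illustration and are not taken from the original.

```python
def wittstein_q(x, a, n, M, m):
    """Wittstein's rate: q_x = a**(-(M - x)**n) + (1/m) * a**(-(m*x)**n)."""
    return a ** (-((M - x) ** n)) + (1.0 / m) * a ** (-((m * x) ** n))

a, n = 1.421, 0.633      # values suggested by Wittstein, as quoted above
M, m = 95, 8             # assumed for illustration only

x_min = M / (m + 1)      # age at which dq/dx vanishes
for x in (0, x_min, 40, 70, 95):
    print(round(x, 2), wittstein_q(x, a, n, M, m))
```

The printed values show the rate falling from $q_0 = (1/m) + a^{-M^n}$ to its minimum at x = M/(m + 1) and rising to unity (apart from a negligible second term) at age M.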

Comparison  of  Different  Methods. 

94.  We  have  now  four  different  graduations  of  the  same  mor- 
tality experience.  We  may  therefore  make  a comparison  of  these 
graduations  from  the  standpoint  of  smoothness  and  of  agreement 
with  the  original  data.  From  the  standpoint  of  smoothness  we 
take  out  the  third  differences  of  the  rates  of  mortality  and  add 
together  their  numerical  values  regardless  of  sign.  As,  however, 
the summation graduation formula was modified at each extremity
over substantially a section of fourteen ages, we break the 42
third  differences  into  three  groups  of  14  each  and  record  the  total 
of  each  group.  The  third  differences  over  five-year  intervals  were 
also  taken  out  and  the  thirty  such  differences  for  each  graduation 
added  together  regardless  of  sign.  This  latter  total  is  an  indication 
of  the  extent  to  which  major  irregularities  are  left  in  the  graduated 
table.  The  results  are  as  follows: 


                    Sum of Third Differences.

                          Unit Interval.                  5-Year
                  1st       2d        3d                 Interval,
                  Group.    Group.    Group.    Total.    Total.
 Graphic          .0030     .0048     .0190     .0268     .4921
 Interpolation    .0019     .0163     .0280     .0462     .9814
 Summation        .0032     .0137     .0090     .0259     .8080
 Makeham          .0011     .0097     .0090     .0198     .1436
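
The smoothness test of Art. 94 amounts to summing the numerical values of the third differences. A minimal sketch follows; the function name and the `interval` parameter are assumptions. With 45 rates (ages 55 to 99) it yields the 42 unit-interval and 30 five-year-interval differences mentioned in the text.

```python
def sum_abs_third_differences(values, interval=1):
    """Sum, regardless of sign, of the third differences taken over `interval`."""
    h = interval
    return sum(
        abs(values[i + 3 * h] - 3 * values[i + 2 * h] + 3 * values[i + h] - values[i])
        for i in range(len(values) - 3 * h)
    )

def grouped_sums(values, group_size=14):
    """Unit-interval third differences, totalled in consecutive groups."""
    d3 = [abs(values[i + 3] - 3 * values[i + 2] + 3 * values[i + 1] - values[i])
          for i in range(len(values) - 3)]
    return [sum(d3[i:i + group_size]) for i in range(0, len(d3), group_size)]
```

Passing the column of graduated rates for ages 55 to 99 gives three group totals from `grouped_sums` and the five-year total from `sum_abs_third_differences(values, interval=5)`.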

95.  From  the  above  comparison  it  appears  that,  as  applied  in 
these  sample  graduations,  the  graphic  and  summation  methods  are 
about  equal  in  their  effect  on  minor  irregularities  being  each 
somewhat  less  powerful  in  this  respect  than  Makeham’s  formula 
and  more  powerful  than  the  interpolation  method.  When  we 
examine  major  irregularities  the  Makeham  graduation  is,  as  was 
to  be  expected,  again  the  most  regular,  the  graphic  method  being 
next with third differences averaging 3½ times as great, the sum-
mation method nearly 6 times and the interpolation method, again
last,  7 times  as  great.  The  order  as  regards  smoothness  is  there- 
fore, first  Makeham’s  law,  second,  graphic  method,  third,  summa- 
tion method  and,  fourth,  interpolation  method.  It  is,  in  fact, 
evident  that  when  Makeham’s  law  is  used  the  differences  represent 
those  arising  from  the  law  itself  combined  with  those  arising  from 
irregularities  due  to  dropped  fractions.  Under  the  graphic  method 
the  smoothness  is,  of  course,  limited  only  by  the  judgment  of  the 
graduator  as  to  what  is  permissible  in  the  way  of  departure  from 
the  original  facts. 

96.  From  the  standpoint  of  agreement  with  data  the  following 
table  shows  the  actual  deaths  and  the  deviations  according  to 
each  graduated  table  by  groups  of  ages.  These  groups  are  quin- 
quennial, except  that  at  each  extremity  the  groups  are  combined 
so  that  in  each  group  the  actual  number  of  deaths  is  in  excess 
of ten.

   Age       Actual                   Deviations.
  Group.     Deaths.   Graphic.   Interpolation.   Summation.   Makeham.
  55-67         16       + 5.3        + 5.7          + 3.7       +10.3
  68-72         44       -10.7        - 9.2          - 6.6       - 4.9
  73-77         65       + 1.7        + 2.8          -  .4       -  .9
  78-82         92       +11.4        +11.2          +10.4       - 1.1
  83-87        112       -13.1        -19.9          - 9.3       -19.7
  88-92         49       + 4.7        + 3.3          + 3.1       + 9.2
  93-99         20       + 1.2        + 5.5             .0       + 7.2

               398       ±48.1        ±57.6          ±33.5       ±53.3

97.  We  have,  therefore,  five  items  for  comparison,  (1)  the  sum 
of  the  deviations  with  regard  to  sign  or  the  deviation  in  the  total 
deaths;  (2)  the  sum  of  the  accumulated  deviations,  with  regard 
to  sign;  (3)  the  sum  of  the  individual  deviations  without  regard 
to  sign;  (4)  the  sum  of  the  accumulated  deviations  without 
regard  to  sign;  (5)  the  sum  of  the  group  deviations  without  regard 
to  sign.  These  are  shown  in  the  following  table  with  a sixth 
column  added,  giving  the  expected  or  average  value  of  the  fifth 
item. 


                   (1)      (2)      (3)      (4)      (5)      (6)

Graphic            +.5    - 9.9     91.9    132.9     48.1     40.5
Interpolation      -.6    -15.1     98.6    168.9     57.6     40.6
Summation          +.9    +17.8     84.3    106.4     33.5     40.5
Makeham            +.1    + 3.4    101.9    253.2     53.3     41.1
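
The five items of paragraph 97 may be illustrated by a short computation. The following is only a sketch: the function name `comparison_items`, the index convention for the groups and the four-age example are assumptions made for illustration and are not part of the original study. Items (1) and (5) are then recomputed from the group deviations in the table of paragraph 96; items (2) to (4) require the deviations at individual ages, which are not reproduced here.

```python
from itertools import accumulate

def comparison_items(deviations, groups):
    """Return items (1) to (5) for deviations listed age by age.

    `groups` is a list of (start, stop) index pairs, stop exclusive,
    marking the age groups used for item (5).
    """
    accumulated = list(accumulate(deviations))
    group_totals = [sum(deviations[a:b]) for a, b in groups]
    return {
        1: sum(deviations),                      # with regard to sign
        2: sum(accumulated),                     # accumulated, with sign
        3: sum(abs(d) for d in deviations),      # individual, without sign
        4: sum(abs(a) for a in accumulated),     # accumulated, without sign
        5: sum(abs(g) for g in group_totals),    # group, without sign
    }

# Hypothetical deviations at four ages, taken in two groups of two ages.
example = comparison_items([1, -2, 3, -1], [(0, 2), (2, 4)])

# Group deviations copied from the table of paragraph 96.
group_deviations = {
    "Graphic":       [5.3, -10.7, 1.7, 11.4, -13.1, 4.7, 1.2],
    "Interpolation": [5.7, -9.2, 2.8, 11.2, -19.9, 3.3, 5.5],
    "Summation":     [3.7, -6.6, -0.4, 10.4, -9.3, 3.1, 0.0],
    "Makeham":       [10.3, -4.9, -0.9, -1.1, -19.7, 9.2, 7.2],
}

# Item (1): total deviation; item (5): group deviations without sign.
item_1 = {name: round(sum(d), 1) for name, d in group_deviations.items()}
item_5 = {name: round(sum(abs(v) for v in d), 1)
          for name, d in group_deviations.items()}
```

Since the groups together cover every age, the signed sum of the group deviations is the same as the signed sum of the individual deviations, which is why item (1) can be recovered from the grouped table.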

98. The figures in the first column are not significant, being
such as might arise from dropped fractions. This is scarcely
true of the second column, except for the last item, but even here
the largest item represents a variation of less than one twentieth
of a year in the average age at death. From the third and fourth
columns it is seen that for individual ages the summation graduation
agrees most closely with the original facts and is followed
in order by the graphic, the interpolation and the Makeham
graduations. The group deviations by the interpolation graduation
are, however, greater than by the Makeham graduation.
Comparing columns (5) and (6) the group deviations appear to
be less in the summation graduation than the expected and
greater in the others. If we examine the column of accumulated
deviations in the comparative tables given under each graduation
we find 8 changes of sign for the graphic, 9 changes for the
interpolation, 10 changes for the summation and 8 changes for the
Makeham graduation. The variations in this respect are therefore
not material.
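
The statement about one twentieth of a year can be checked by simple arithmetic. The sketch below assumes, as an interpretation not spelled out in the text, that when the total deviation is nearly zero the sum of the accumulated deviations divided by the total deaths measures the displacement of the average age at death. The names `accumulated` and `mean_age_shift` are illustrative only.

```python
# Column (2), the sum of the accumulated deviations with regard to sign,
# and the total actual deaths, both copied from the tables above.
accumulated = {
    "Graphic": -9.9,
    "Interpolation": -15.1,
    "Summation": 17.8,
    "Makeham": 3.4,
}
total_deaths = 398

def mean_age_shift(sum_accumulated, deaths):
    """Approximate displacement of the average age at death, in years,
    taken without regard to sign (assumed interpretation)."""
    return abs(sum_accumulated) / deaths

shifts = {name: mean_age_shift(v, total_deaths)
          for name, v in accumulated.items()}
largest_name = max(shifts, key=shifts.get)
largest = shifts[largest_name]
```

On these figures the largest item is that of the summation graduation, and 17.8 divided by 398 is somewhat under .05.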

99.  Taking  all  these  considerations  into  account  it  would  appear 
that  the  Makeham  graduation  must  in  this  case  be  considered 
unsatisfactory  on  account  of  the  large  accumulated  deviations 
shown  in  column  (4)  and  the  marked  excess  of  the  group  deviations 
over  the  expected.  So  also  must  the  interpolation  graduation 
on  account  of  the  excess  of  group  deviations  combined  with  the 
deficiency in smoothness. In the choice between the other two
it is to be noted that, although the sum of the third differences is
practically the same in the graphic graduation as in the summation,
the greater part of the former total arises from the least
important group. It would, in fact, have been greatly reduced
had values of $q_x$, derived from the formula, been used at the older
ages in the standard table instead of those derived from the $l_x$
column. This fact combined with the smaller third differences
over five-year intervals gives the preference on the score of
smoothness to the graphic graduation. The fact that the group
deviations in the summation graduation are materially less than
the expected also indicates that it follows too closely the major
irregularities of the original data. The preference, on the whole,
falls therefore to the graphic graduation, although the excess of
the  group  deviations  over  the  expected  indicates  that  it  could  be 
improved  by  bringing  it  more  closely  into  harmony  with  the 
original  data  in  those  sections  where  the  deviation  is  the  greatest. 

100. In considering the method of graduation to be adopted in
any particular case, whether the series to be graduated is a
frequency distribution or a series of ratios, as in the case of a
mortality experience, it is evident that a graduation by
mathematical formula possesses an advantage over all others on the
score of smoothness. And where at least two arbitrary constants
are available the sum of the deviations and the accumulated
deviations can be made to vanish. In view of these advantages a
mathematical formula will be the best provided two further
conditions are satisfied. The first is that the total irrespective of
sign of the deviations, in suitable groups, should not materially
exceed the expected. The amount of excess which is permissible will
depend on the number of groups used in the comparison. The
second condition is that any special feature known to be
characteristic of the series should be reproduced by the formula. In
deciding this point an examination of allied series should be made
to see if the same special feature occurs. If it does not and if
there is no assignable cause, other than accidental fluctuation,
for the appearance of the special feature in the series in question,
it should be considered as entirely accidental and the
mathematical formula used even though it does not reproduce it. For
the graduation of mortality tables, Makeham's law possesses
additional advantages owing to its special adaptation to the
calculation of joint and contingent benefits in connection with
insurance or annuity transactions. For any table, therefore, which it
is expected to use for such purposes that law should be used if
possible, a very liberal interpretation being given to the two
conditions above mentioned.

101.  Where  a mathematical  law  cannot  be  applied  it  will 
usually  be  found  that  where  the  data  are  very  scanty  the  graphic 
method  will  produce  the  best  results  as  irregularities  will  occur 
of  wide  range,  such  as  neither  the  interpolation  nor  the  summation 
method  is  competent  to  remove.  The  interpolation  method  may 
be  used,  however,  in  combination  with  the  graphic,  the  latter 
being  used  instead  of  Mr.  King’s  formula  to  determine  the  points 
upon  which  to  interpolate.  The  points  will,  of  course,  be  subject 
to  subsequent  modification  if  necessary,  just  as  the  curve  is  subject 
to  subsequent  amendment  in  the  regular  graphic  method.  This 
amounts  to  the  substitution  of  an  analytical  interpolation  for  the 
graphic  between  the  selected  points. 

102.  Where,  however,  the  data  are  more  extensive  so  as  to  give 
a satisfactory  degree  of  regularity  under  the  operation  of  the 
interpolation  or  of  the  summation  method,  those  methods  will  be 
the  more  satisfactory  as  the  values  derived  do  not  depend  on  the 
judgment  of  the  operator  except  as  exercised  in  the  selection  of  the 
particular  graduation  formula  to  be  used,  and  they  can  be  obtained 
to  a greater  degree  of  accuracy  than  is  possible  in  reading  them 
from  a diagram.  As  between  the  two  methods,  interpolation  will 
probably  be  found  the  more  useful  in  connection  with  census 
data  and  other  cases  where  the  original  facts  are  given  in  groups. 
In  other  cases  a combination  of  the  two  methods  may  be  sometimes 
used,  the  points  for  the  interpolation  being  determined  by  means 
of  a summation  formula  of  great  weight  without  regard  to  its 
smoothing  coefficient  and  the  interpolation  being  depended  upon 
to  introduce  the  necessary  smoothness. 


Note. 


In  the  course  of  this  study  we  have  had  occasion  to  refer  to 
certain  theorems  regarding  the  results  of  repeated  trials,  the  proof 
of  which  may  not  be  available  for  reference  by  the  student. 
They are accordingly collected in this note. Where the probability
of an event happening at each trial is $p$ and that of its
failing is $q = 1 - p$, the probability that in $n$ trials it will happen
$x$ times and fail $n - x$ times in any assigned order is evidently
$p^x q^{n-x}$. As, therefore, there are in all $n!/\{x!\,(n-x)!\}$ different
ways in which the event may happen $x$ times and fail $n - x$ times,
the total probability of exactly $x$ successes out of $n$ trials is

\[
\frac{n!}{x!\,(n-x)!}\, p^x q^{n-x}.
\]

To  determine  then  the  expected  number  of  successes  we  multiply 
each  number  by  its  probability  and  sum,  the  result  being 


\[
\sum_{x=0}^{n} x\,\frac{n!}{x!\,(n-x)!}\, p^x q^{n-x}
 = np \sum_{x=1}^{n} \frac{(n-1)!}{(x-1)!\,(n-x)!}\, p^{x-1} q^{n-x}
 = np\,(p+q)^{n-1} = np.
\]


Similarly the mean value of $x(x-1)$ is equal to

\[
\sum_{x=0}^{n} x(x-1)\,\frac{n!}{x!\,(n-x)!}\, p^x q^{n-x}
 = n(n-1)p^2 \sum_{x=2}^{n} \frac{(n-2)!}{(x-2)!\,(n-x)!}\, p^{x-2} q^{n-x}
 = n(n-1)p^2,
\]

and generally the mean value of $x(x-1)(x-2)\cdots(x-r+1)$
is $n(n-1)(n-2)\cdots(n-r+1)\,p^r$.
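
The general result may be verified by direct summation for particular values. The following is a minimal sketch, using exact rational arithmetic; the values $n = 12$ and $p = 3/10$ and the function names are chosen here purely for illustration.

```python
from fractions import Fraction
from math import comb

def falling(x, r):
    """The product x(x-1)...(x-r+1)."""
    out = 1
    for j in range(r):
        out *= x - j
    return out

def factorial_moment(n, p, r):
    """Mean value of x(x-1)...(x-r+1) over n trials, by direct summation."""
    q = 1 - p
    return sum(falling(x, r) * comb(n, x) * p**x * q ** (n - x)
               for x in range(n + 1))

n, p = 12, Fraction(3, 10)   # illustrative values only

# Compare the direct sum with n(n-1)...(n-r+1) p^r for r = 1, 2, 3, 4.
agreement = {r: factorial_moment(n, p, r) == falling(n, r) * p**r
             for r in range(1, 5)}
```

Because the arithmetic is exact, each comparison is an equality and not merely a close agreement.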

Thus  in  the  notation  of  moments  we  have,  taking  zero  as  origin, 


\[
\begin{aligned}
m_1 &= np, \\
m_2 - m_1 &= n(n-1)p^2, \\
m_3 - 3m_2 + 2m_1 &= n(n-1)(n-2)p^3, \\
m_4 - 6m_3 + 11m_2 - 6m_1 &= n(n-1)(n-2)(n-3)p^4.
\end{aligned}
\]

Whence

\[
\begin{aligned}
m_2 &= n^2p^2 + npq, \\
m_3 &= n^3p^3 + 3n^2p^2q + npq(q-p), \\
m_4 &= n^4p^4 + 6n^3p^3q - n^2(11p^2 - 18p^2q - 11p^4)
       + n\{6p - 11pq + 6pq(q-p) - 6p^4\} \\
    &= n^4p^4 + 6n^3p^3q - n^2p^2q(4p - 7q) + npq(1 - 6pq).
\end{aligned}
\]

Or transferring to the mean value as origin

\[
\begin{aligned}
\mu_2 &= npq, \\
\mu_3 &= npq(q-p), \\
\mu_4 &= npq(1 - 6pq) + 3n^2p^2q^2.
\end{aligned}
\]
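
These expressions may be checked against moments computed directly from the probabilities. The sketch below uses exact fractions; the function name `moment` and the values $n = 10$, $p = 1/4$ are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

def moment(n, p, k, origin=0):
    """Mean value of (x - origin)**k for the number of successes x."""
    q = 1 - p
    return sum((x - origin) ** k * comb(n, x) * p**x * q ** (n - x)
               for x in range(n + 1))

n, p = 10, Fraction(1, 4)   # illustrative values only
q = 1 - p

# Moments about zero, m_1 ... m_4.
m = {k: moment(n, p, k) for k in range(1, 5)}

# Moments about the mean value np, mu_2 ... mu_4.
mu = {k: moment(n, p, k, origin=n * p) for k in range(2, 5)}

formulas_zero = {
    1: n * p,
    2: n**2 * p**2 + n * p * q,
    3: n**3 * p**3 + 3 * n**2 * p**2 * q + n * p * q * (q - p),
    4: (n**4 * p**4 + 6 * n**3 * p**3 * q
        - n**2 * p**2 * q * (4 * p - 7 * q) + n * p * q * (1 - 6 * p * q)),
}
formulas_mean = {
    2: n * p * q,
    3: n * p * q * (q - p),
    4: n * p * q * (1 - 6 * p * q) + 3 * n**2 * p**2 * q**2,
}
```

The dictionaries `m` and `formulas_zero`, and likewise `mu` and `formulas_mean`, should agree exactly for any admissible $n$ and rational $p$.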

To  determine  the  mean  value  of  the  departure  from  the  expected 
number,  without  regard  to  sign,  we  must  make  two  separate 
summations.  If  we  designate  by  l the  largest  integer  contained 
in  np  we  have  for  this  mean  value 


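\[
\begin{aligned}
&\sum_{x=0}^{l} (np - x)\,\frac{n!}{x!\,(n-x)!}\, p^x q^{n-x}
 + \sum_{x=l+1}^{n} (x - np)\,\frac{n!}{x!\,(n-x)!}\, p^x q^{n-x} \\
&\quad= \sum_{x=0}^{l} \{(n-x)p - xq\}\,\frac{n!}{x!\,(n-x)!}\, p^x q^{n-x}
 + \sum_{x=l+1}^{n} \{xq - (n-x)p\}\,\frac{n!}{x!\,(n-x)!}\, p^x q^{n-x} \\
&\quad= \sum_{x=0}^{l} \frac{n!}{x!\,(n-x-1)!}\, p^{x+1} q^{n-x}
 - \sum_{x=1}^{l} \frac{n!}{(x-1)!\,(n-x)!}\, p^{x} q^{n-x+1} \\
&\qquad + \sum_{x=l+1}^{n} \frac{n!}{(x-1)!\,(n-x)!}\, p^{x} q^{n-x+1}
 - \sum_{x=l+1}^{n-1} \frac{n!}{x!\,(n-x-1)!}\, p^{x+1} q^{n-x} \\
&\quad= \sum_{x=0}^{l} \frac{n!}{x!\,(n-x-1)!}\, p^{x+1} q^{n-x}
 - \sum_{x=0}^{l-1} \frac{n!}{x!\,(n-x-1)!}\, p^{x+1} q^{n-x} \\
&\qquad + \sum_{x=l}^{n-1} \frac{n!}{x!\,(n-x-1)!}\, p^{x+1} q^{n-x}
 - \sum_{x=l+1}^{n-1} \frac{n!}{x!\,(n-x-1)!}\, p^{x+1} q^{n-x} \\
&\quad= 2\,\frac{n!}{l!\,(n-l-1)!}\, p^{l+1} q^{n-l}.
\end{aligned}
\]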
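
The reduction of the two summations to a single term may be confirmed numerically. The following sketch compares the direct sum with the closed expression in exact arithmetic; the function names and the trial values of $n$ and $p$ are assumptions for illustration, and $p$ is taken less than unity so that $n - l - 1$ is not negative.

```python
from fractions import Fraction
from math import comb, factorial, floor

def mean_departure_direct(n, p):
    """Mean value of |x - np| by summing over every possible x."""
    q = 1 - p
    return sum(abs(n * p - x) * comb(n, x) * p**x * q ** (n - x)
               for x in range(n + 1))

def mean_departure_closed(n, p):
    """2 n!/(l!(n-l-1)!) p^(l+1) q^(n-l), l the largest integer in np."""
    q = 1 - p
    l = floor(n * p)
    coefficient = Fraction(factorial(n), factorial(l) * factorial(n - l - 1))
    return 2 * coefficient * p ** (l + 1) * q ** (n - l)

# Illustrative cases, including one where np is itself an integer.
cases = [(10, Fraction(3, 10)), (7, Fraction(1, 3)), (25, Fraction(2, 7))]
identical = [mean_departure_direct(n, p) == mean_departure_closed(n, p)
             for n, p in cases]
```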


In  order  to  reduce  this  expression  to  a simple  form  it  is  necessary 
to  substitute  an  approximate  expression  for  the  factorials  involved. 
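We have

\[
\log_e \frac{n}{n-1}
 = 2\left\{ \frac{1}{2n-1} + \frac{1}{3(2n-1)^3} + \frac{1}{5(2n-1)^5} + \cdots \right\},
\]

whence

\[
\left(n - \tfrac{1}{2}\right)\log_e \frac{n}{n-1} - 1
 = \frac{1}{3(2n-1)^2} + \frac{1}{5(2n-1)^4} + \cdots
\]

This is positive for all values of $n$ and less than

\[
\frac{1}{3}\left\{ \frac{1}{(2n-1)^2} + \frac{1}{(2n-1)^4} + \cdots \right\}
 = \frac{1}{3(2n-1)^2}\left\{ 1 - \frac{1}{(2n-1)^2} \right\}^{-1}
 = \frac{1}{12n(n-1)} = \frac{1}{12(n-1)} - \frac{1}{12n}.
\]

But

\[
\begin{aligned}
\left(n - \tfrac{1}{2}\right)\log_e \frac{n}{n-1} - 1
 &= \left(n + \tfrac{1}{2}\right)\log_e n - \left(n - \tfrac{1}{2}\right)\log_e (n-1) - \log_e n - 1 \\
 &= \left\{ \left(n + \tfrac{1}{2}\right)\log_e n - \log_e n! - n \right\} \\
 &\quad - \left\{ \left(n - \tfrac{1}{2}\right)\log_e (n-1) - \log_e (n-1)! - (n-1) \right\}.
\end{aligned}
\]

Designating now $\{(n + \tfrac{1}{2})\log_e n - \log_e n! - n\}$ by $f(n)$ we
see from this equation that $\Delta f(n-1)$ is always positive and less
than $\left\{ \frac{1}{12(n-1)} - \frac{1}{12n} \right\}$. For any value of $n$ greater than unity
we have therefore, since $f(1) = -1$,

\[
-1 < f(n) < -\tfrac{11}{12}.
\]

When $n$ is indefinitely increased $f(n)$ must tend to a definite limit
which we may designate by $-c$. For any finite value of $n$ the
value of this function must differ from its ultimate value $-c$
by less than

\[
\sum_{m=n}^{\infty} \left\{ \frac{1}{12m} - \frac{1}{12(m+1)} \right\} = \frac{1}{12n}.
\]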
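
The bounds on $f(n)$ and on its difference may be tested numerically. The sketch below obtains $\log_e n!$ from `math.lgamma(n + 1)`; the helper names and the range of $n$ examined are assumptions, and the range is kept small so that rounding in floating point does not blur the narrow upper bound.

```python
from math import lgamma, log

def f(n):
    """f(n) = (n + 1/2) log n - log n! - n, with log n! = lgamma(n + 1)."""
    return (n + 0.5) * log(n) - lgamma(n + 1) - n

def difference_bounds_hold(n):
    """Check 0 < f(n) - f(n-1) < 1/(12(n-1)) - 1/(12n) for one n > 1."""
    delta = f(n) - f(n - 1)
    return 0.0 < delta < 1.0 / (12 * (n - 1)) - 1.0 / (12 * n)

differences_ok = all(difference_bounds_hold(n) for n in range(2, 31))
values_ok = all(-1.0 < f(n) < -11.0 / 12.0 for n in range(2, 31))
```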


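We have therefore

\[
\left(n + \tfrac{1}{2}\right)\log_e n - \log_e n! - n = -c - \frac{\theta}{12n},
\]

where $0 < \theta < 1$, or

\[
\log_e n! = \left(n + \tfrac{1}{2}\right)\log_e n - n + c + \frac{\theta}{12n},
\]

where the last term may be neglected if $n$ is large. The exact
value of $c$ is determined from the fact that $\pi/2$ is the limit, when $n$
is indefinitely increased, of

\[
\frac{2^{2n}(n!)^2(2n+1)}{1^2 \cdot 3^2 \cdots (2n+1)^2}
\quad\text{or of}\quad
\frac{2^{4n}(n!)^4(2n+1)}{\{(2n+1)!\}^2}.
\]

Hence $\log_e \pi$ is the limit of

\[
\begin{aligned}
&(4n+1)\log_e 2 + 4\log_e n! - 2\log_e (2n)! - \log_e (2n+1) \\
&\quad= (4n+1)\log_e 2 + 2(2n+1)\log_e n - 4n + 4c \\
&\qquad - (4n+1)\log_e 2n + 4n - 2c - \log_e (2n+1) \\
&\quad= \log_e n + 2c - \log_e (2n+1) = 2c - \log_e 2, \text{ in limit.}
\end{aligned}
\]

Hence

\[
2c = \log_e 2\pi, \qquad c = \tfrac{1}{2}\log_e 2\pi.
\]

Hence we have

\[
n! = \sqrt{2\pi}\,\frac{n^{\,n+1/2}}{e^{n}}
\]

when $n$ is large.

Returning then to the expression

\[
2\,\frac{n!}{l!\,(n-l-1)!}\, p^{l+1} q^{n-l}
\quad\text{or}\quad
2\,\frac{n!\,(n-l)}{l!\,(n-l)!}\, p^{l+1} q^{n-l}
\]

for the mean departure and putting $l = np - k$, where $k$ is fractional,
and consequently $n - l = nq + k$, the expression takes the
form, when the approximate values of the factorials are used,

\[
\begin{aligned}
&\frac{2}{\sqrt{2\pi}}\, n^{n+1/2}(nq+k)(np-k)^{-np-1/2+k}(nq+k)^{-nq-1/2-k}\, p^{np+1-k} q^{nq+k} \\
&\quad= \frac{2}{\sqrt{2\pi}}\sqrt{npq}\,\left(1 - \frac{k}{np}\right)^{-np-1/2+k}\left(1 + \frac{k}{nq}\right)^{-nq+1/2-k} \\
&\quad= \frac{2}{\sqrt{2\pi}}\sqrt{npq}\; e^{k-k}
\end{aligned}
\]

(approximately) since $k$ is small compared with $np$ or $nq$.
Hence the expression reduces to

\[
\sqrt{\frac{2npq}{\pi}} = .79788\sqrt{npq} \approx \tfrac{4}{5}\sqrt{npq}.
\]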
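
The closeness of the approximation may be judged from a numerical case. The sketch below sets the exact mean departure beside $\sqrt{2npq/\pi}$; the function names and the values $n = 400$, $p = 1/4$ are assumptions chosen for illustration, with $np$ an exact integer so that $l$ is unambiguous in floating point.

```python
from math import comb, floor, pi, sqrt

def exact_mean_departure(n, p):
    """2 n!/(l!(n-l-1)!) p^(l+1) q^(n-l), l the largest integer in np."""
    q = 1.0 - p
    l = floor(n * p)
    # (n - l) * C(n, l) is the same number as n!/(l!(n-l-1)!).
    return 2 * (n - l) * comb(n, l) * p ** (l + 1) * q ** (n - l)

def approximate_mean_departure(n, p):
    """The limiting form sqrt(2npq/pi), i.e. about .79788 sqrt(npq)."""
    return sqrt(2.0 * n * p * (1.0 - p) / pi)

n, p = 400, 0.25   # illustrative values only
exact = exact_mean_departure(n, p)
approximate = approximate_mean_departure(n, p)
ratio = exact / approximate
```

For values of $np$ and $nq$ of this size the ratio should differ from unity by a small fraction of one per cent.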


Again, putting $np + x$ for $x$ in the expression for the probability
of exactly $x$ successes in $n$ trials, and so transferring the origin
to the expected number, it becomes

\[
\frac{n!}{(np+x)!\,(nq-x)!}\, p^{np+x} q^{nq-x}.
\]

Substituting then the approximate values of the factorials this
becomes

\[
\begin{aligned}
&\frac{1}{\sqrt{2\pi}}\, n^{n+1/2}(np+x)^{-np-x-1/2}(nq-x)^{-nq+x-1/2}\, p^{np+x} q^{nq-x} \\
&\quad= \frac{1}{\sqrt{2\pi npq}}\left(1 + \frac{x}{np}\right)^{-np-x-1/2}\left(1 - \frac{x}{nq}\right)^{-nq+x-1/2}.
\end{aligned}
\]

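If now we suppose $np$ and $nq$ to be very large so that $x/np$
may be neglected, but that $x^2/np$ may not be neglected, the
logarithm of this expression becomes

\[
\begin{aligned}
&-\tfrac{1}{2}\log 2\pi npq - \left(np + x + \tfrac{1}{2}\right)\log\left(1 + \frac{x}{np}\right)
 - \left(nq - x + \tfrac{1}{2}\right)\log\left(1 - \frac{x}{nq}\right) \\
&\quad= -\tfrac{1}{2}\log 2\pi npq
 - \left(np + x + \tfrac{1}{2}\right)\left(\frac{x}{np} - \frac{x^2}{2n^2p^2} + \cdots\right) \\
&\qquad + \left(nq - x + \tfrac{1}{2}\right)\left(\frac{x}{nq} + \frac{x^2}{2n^2q^2} + \cdots\right) \\
&\quad= -\tfrac{1}{2}\log 2\pi npq - \frac{x^2}{2np} - \frac{x^2}{2nq}
 = -\tfrac{1}{2}\log 2\pi npq - \frac{x^2}{2npq},
\end{aligned}
\]

and the expression itself becomes

\[
\frac{1}{\sqrt{2\pi npq}}\; e^{-x^2/(2npq)}.
\]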
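
A numerical comparison shows how nearly this expression reproduces the exact probabilities when $np$ and $nq$ are large. The sketch below is illustrative only: the function names, the values $n = 400$, $p = 1/4$, and the departures examined are assumptions not taken from the text.

```python
from math import comb, exp, pi, sqrt

def exact_probability(n, p, successes):
    """Probability of exactly `successes` successes in n trials."""
    return comb(n, successes) * p**successes * (1.0 - p) ** (n - successes)

def approximate_probability(n, p, departure):
    """exp(-x^2/(2npq)) / sqrt(2 pi npq), x being the departure from np."""
    npq = n * p * (1.0 - p)
    return exp(-departure**2 / (2.0 * npq)) / sqrt(2.0 * pi * npq)

n, p = 400, 0.25          # illustrative values; np = 100 exactly
expected = 100

# Ratio of exact to approximate probability at several departures.
ratios = {x: exact_probability(n, p, expected + x)
             / approximate_probability(n, p, x)
          for x in (-10, -5, 0, 5, 10)}
```

The ratios depart further from unity as the departure grows, which accords with the condition that $x/np$ be negligible.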


Where $np$ and $nq$ are both large therefore the curve

\[
y = \frac{1}{c\sqrt{\pi}}\; e^{-x^2/c^2}
\]

approximately represents the probabilities of the various departures
when $c^2 = 2npq$, and from tables which have been formed of
the integral of this expression it is found that there is approximately
an even chance of the departure exceeding $\tfrac{2}{3}\sqrt{npq}$.
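
The even chance may be checked without tables by means of the error function. In the sketch below the integral of the curve between $-t$ and $+t$ is taken as $\operatorname{erf}(t/c)$ with $c^2 = 2npq$; the function names and the bisection search are assumptions added for illustration.

```python
from math import erf, sqrt

def chance_departure_exceeds(multiple):
    """Chance that the departure exceeds `multiple` times sqrt(npq),
    using the limiting curve with c^2 = 2npq."""
    return 1.0 - erf(multiple / sqrt(2.0))

at_two_thirds = chance_departure_exceeds(2.0 / 3.0)

def even_chance_multiple(tolerance=1e-10):
    """Find by bisection the multiple giving a chance of exactly one half."""
    low, high = 0.0, 2.0
    while high - low > tolerance:
        middle = (low + high) / 2.0
        if chance_departure_exceeds(middle) > 0.5:
            low = middle      # chance still too large, move outward
        else:
            high = middle
    return (low + high) / 2.0

multiple = even_chance_multiple()
```

The multiple found in this way is slightly greater than two thirds, so that the chance at two thirds is a little over one half.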